1.0 Background of the Invention

1.1 Field of the Invention 

This invention relates to transformed host cells and vectors which comprise nucleic acid
segments encoding genetically-engineered, recombinant Bacillus thuringiensis δ-endotoxins
which are active against Coleopteran insects.

1.2 Description of the Related Art

Almost all field crops, plants, and commercial farming areas are susceptible to attack by
one or more insect pests. Particularly problematic are Coleopteran and Lepidopteran pests.
For example, vegetable and cole crops such as artichokes, kohlrabi, arugula, leeks,
asparagus, lentils, beans, lettuce (e.g., head, leaf, romaine), beets, bok choy, malanga,
broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honey dew, cantaloupe), brussels
sprouts, cabbage, cardoni, carrots, napa, cauliflower, okra, onions, celery, parsley, chick
peas, parsnips, chicory, peas, Chinese cabbage, peppers, collards, potatoes, cucumber,
pumpkins, cucurbits, radishes, dry bulb onions, rutabaga, eggplant, salsify, escarole,
shallots, endive, soybean, garlic, spinach, green onions, squash, greens, sugar beets, sweet
potatoes, turnip, swiss chard, horseradish, tomatoes, kale, turnips, and a variety of spices
are sensitive to infestation by one or more of the following insect pests: alfalfa looper,
armyworm, beet armyworm, artichoke plume moth, cabbage budworm, cabbage looper, cabbage
webworm, corn earworm, celery leafeater, cross-striped cabbageworm, European corn borer,
diamondback moth, green cloverworm, imported cabbageworm, melonworm, omnivorous leafroller,
pickleworm, rindworm complex, saltmarsh caterpillar, soybean looper, tobacco budworm, tomato
fruitworm, tomato hornworm, tomato pinworm, velvetbean caterpillar, and yellowstriped
armyworm. Likewise, pasture and hay crops such as alfalfa, pasture grasses and silage are
often attacked by such pests as armyworm, beet armyworm, alfalfa caterpillar, European
skipper, a variety of loopers and webworms, as well as yellowstriped armyworms.

Fruit and vine crops such as apples, apricots, cherries, nectarines, peaches, pears, plums,
prunes, quince, almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus,
blackberries, blueberries, boysenberries, cranberries, currants, loganberries, raspberries,
strawberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate, pineapple, and
tropical fruits are often susceptible to attack and defoliation by achema sphinx moth,
amorbia, armyworm, citrus cutworm, banana skipper, blackheaded fireworm, blueberry
leafroller, cankerworm, cherry fruitworm, cranberry girdler, eastern tent caterpillar, fall
webworm, filbert leafroller, filbert webworm, fruit tree leafroller, grape berry moth, grape
leaffolder, grapeleaf skeletonizer, green fruitworm, gummosos-batrachedra commosae, gypsy
moth, hickory shuckworm, hornworms, loopers, navel orangeworm, obliquebanded leafroller,
omnivorous leafroller, omnivorous looper, orange tortrix, orangedog, oriental fruit moth,
pandemis leafroller, peach twig borer, pecan nut casebearer, redbanded leafroller,
redhumped caterpillar, roughskinned cutworm, saltmarsh caterpillar, spanworm, tent
caterpillar, thecla-thecla basillides, tobacco budworm, tortrix moth, tufted apple budmoth,
variegated leafroller, walnut caterpillar, western tent caterpillar, and yellowstriped
armyworm.

Field crops such as canola/rape seed, evening primrose, meadow foam, corn (field, sweet,
popcorn), cotton, hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye,
wheat, etc.), sorghum, soybeans, sunflowers, and tobacco are often targets for infestation
by insects including armyworm, Asian and other corn borers, banded sunflower moth, beet
armyworm, bollworm, cabbage looper, corn rootworm (including southern and western
varieties), cotton leaf perforator, diamondback moth, European corn borer, green
cloverworm, headmoth, headworm, imported cabbageworm, loopers (including Anacamptodes
spp.), obliquebanded leafroller, omnivorous leaftier, podworm, saltmarsh caterpillar,
southwestern corn borer, soybean looper, spotted cutworm, sunflower moth, tobacco budworm,
tobacco hornworm, and velvetbean caterpillar.

Bedding plants, flowers, ornamentals, vegetables and container stock are frequently fed
upon by a host of insect pests such as armyworm, azalea moth, beet armyworm, diamondback
moth, ello moth (hornworm), Florida fern caterpillar, Io moth, loopers, oleander moth,
omnivorous leafroller, omnivorous looper, and tobacco budworm.

Forests, fruit, ornamental, and nut-bearing trees, as well as shrubs and other nursery
stock are often susceptible to attack from diverse insects such as bagworm, blackheaded
budworm, browntail moth, California oakworm, Douglas fir tussock moth, elm spanworm, fall
webworm, fruittree leafroller, greenstriped mapleworm, gypsy moth, jack pine budworm, mimosa
webworm, pine butterfly, redhumped caterpillar, saddleback caterpillar, saddle prominent
caterpillar, spring and fall cankerworm, spruce budworm, tent caterpillar, tortrix, and
western tussock moth. Likewise, turf grasses are often attacked by pests such as armyworm,
sod webworm, and tropical sod webworm.

Because crops of commercial interest are often the target of insect attack,
environmentally-sensitive methods for controlling or eradicating insect infestation are
desirable in many instances. This is particularly true for farmers, nurserymen, growers, and
those responsible for commercial and residential areas who seek to control insect
populations using eco-friendly compositions.

The most widely used environmentally-sensitive insecticidal formulations developed in
recent years have been composed of microbial pesticides derived from the bacterium Bacillus
thuringiensis. B. thuringiensis is a Gram-positive bacterium that produces crystal proteins
or inclusion bodies which are specifically toxic to certain orders and species of insects.
Many different strains of B. thuringiensis have been shown to produce insecticidal crystal
proteins. Compositions including B. thuringiensis strains which produce insecticidal
proteins have been commercially available and used as environmentally-acceptable
insecticides because they are quite toxic to the specific target insect, but are harmless
to plants and other non-targeted organisms.

1.2.1 δ-Endotoxins

δ-endotoxins are used to control a wide range of leaf-eating caterpillars and beetles, as
well as mosquitoes. These proteinaceous parasporal crystals, also referred to as
insecticidal crystal proteins, crystal proteins, Bt inclusions, crystalline inclusions,
inclusion bodies, and Bt toxins, are a large collection of insecticidal proteins produced by
B. thuringiensis that are toxic upon ingestion by a susceptible insect host. Over the past
decade research on the structure and function of B. thuringiensis toxins has covered all of
the major toxin categories, and while these toxins differ in specific structure and
function, general similarities in the structure and function are assumed. Based on the
accumulated knowledge of B. thuringiensis toxins, a generalized mode of action for B.
thuringiensis toxins has been proposed and includes: ingestion by the insect,
solubilization in the insect midgut (a combination stomach and small intestine), resistance
to digestive enzymes, sometimes with partial digestion actually "activating" the toxin,
binding to the midgut cells, formation of a pore in the insect cells, and the disruption of
cellular homeostasis (English and Slatin, 1992).

1.2.2 Genes Encoding Crystal Proteins 

Many of the δ-endotoxins are related to various degrees by similarities in their amino acid
sequences. Historically, the proteins and the genes which encode them were classified based
largely upon their spectrum of insecticidal activity. The review by Höfte and Whiteley
(1989) discusses the genes and proteins that were identified in B. thuringiensis prior to
1990, and sets forth the nomenclature and classification scheme which has traditionally been
applied to B. thuringiensis genes and proteins: cryI genes encode lepidopteran-toxic CryI
proteins, cryII genes encode CryII proteins that are toxic to both lepidopterans and
dipterans, cryIII genes encode coleopteran-toxic CryIII proteins, while cryIV genes encode
dipteran-toxic CryIV proteins, etc. Based on the degree of sequence similarity, the proteins
were further classified into subfamilies; more highly related proteins within each family
were assigned divisional letters such as CryIA, CryIB, CryIC, etc. Even more closely related
proteins within each division were given names such as CryIC1, CryIC2, etc.
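The traditional scheme described above can be illustrated with a minimal Python sketch. The function name `classify_old_name` and the dictionary `TARGET_ORDERS` are hypothetical names introduced only for illustration; the mapping covers just the four classes enumerated in the text, and names outside that pattern are rejected rather than guessed.

```python
import re

# Target insect orders for the four classes named in the text. Later classes
# (the "etc." in the text) are deliberately omitted because they are not enumerated.
TARGET_ORDERS = {
    "I": ("Lepidoptera",),
    "II": ("Lepidoptera", "Diptera"),
    "III": ("Coleoptera",),
    "IV": ("Diptera",),
}

# Roman-numeral class followed by the divisional letter, e.g. "CryIA" or "cryIIIB".
_OLD_NAME = re.compile(r"^[Cc]ry(IV|I{1,3})([A-Z])")


def classify_old_name(name):
    """Return (class, division, target orders) for a traditional-style name."""
    match = _OLD_NAME.match(name)
    if match is None:
        raise ValueError(f"not a class I-IV traditional name: {name!r}")
    toxin_class, division = match.groups()
    return toxin_class, division, TARGET_ORDERS[toxin_class]


if __name__ == "__main__":
    for old in ("CryIA(c)", "CryIIA", "CryIIIB2", "CryIVD"):
        print(old, classify_old_name(old))
```

Because the traditional names encode target specificity, a lookup of this kind says nothing about sequence relatedness, which is the motivation for the revised scheme.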

Recently a new nomenclature was developed which systematically classifies the Cry proteins
based upon amino acid sequence homology rather than upon insect target specificities. This
classification scheme, including most of the known toxins but not including allelic
variations in individual polypeptides, is summarized in Table 1.

Table 1

Known B. thuringiensis δ-Endotoxins, GenBank Accession Numbers,
and Revised Nomenclature*

New          Old            GenBank Accession #
Cry1Aa1      CryIA(a)       M11250
Cry1Aa2      CryIA(a)       M10917
Cry1Aa3      CryIA(a)       D00348
Cry1Aa4      CryIA(a)       X13535
Cry1Aa5      CryIA(a)       D175182
Cry1Aa6      CryIA(a)       U43605
Cry1Ab1      CryIA(b)       M13898
Cry1Ab2      CryIA(b)       M12661
Cry1Ab3      CryIA(b)       M15271
Cry1Ab4      CryIA(b)       D00117
Cry1Ab5      CryIA(b)       X04698
Cry1Ab6      CryIA(b)       M37263
Cry1Ab7      CryIA(b)       X13233
Cry1Ab8      CryIA(b)       M16463
Cry1Ab9      CryIA(b)       X54939
Cry1Ab10     CryIA(b)       A29125
Cry1Ac1      CryIA(c)       M11068
Cry1Ac2      CryIA(c)       M35524
Cry1Ac3      CryIA(c)       X54159
Cry1Ac4      CryIA(c)       M73249
Cry1Ac5      CryIA(c)       M73248
Cry1Ac6      CryIA(c)       U43606
Cry1Ac7      CryIA(c)       U87793
Cry1Ac8      CryIA(c)       U87397
Cry1Ac9      [illegible]    U89872
Cry1Ac10     CryIA(c)       AJ002514
Cry1Ad1      CryIA(d)       M73250
Cry1Ae1      CryIA(e)       M65252
Cry1Ba1      CryIB          X06711
Cry1Ba2      -              X95704
Cry1Bb1      ET5            L32020
Cry1Bc1      CryIb(c)       Z46442
Cry1Bd1      CryE1          U70726
Cry1Ca1      CryIC          X07518
Cry1Ca2      CryIC          X13620
Cry1Ca3      CryIC          M73251
Cry1Ca4      CryIC          A27642
Cry1Ca5      CryIC          X96682
Cry1Ca6      CryIC          X96683
Cry1Ca7      CryIC          X96684
Cry1Cb1      CryIC(b)       M97880
Cry1Da1      CryID          X54160
Cry1Db1      PrtB           Z22511
Cry1Ea1      CryIE          X53985
Cry1Ea2      CryIE          X56144
Cry1Ea3      CryIE          M73252
Cry1Ea4      -              U94323
Cry1Eb1      CryIE(b)       M73253
Cry1Fa1      CryIF          M63897
Cry1Fa2      CryIF          M63897
Cry1Fb1      PrtD           Z22512
Cry1Ga1      PrtA           Z22510
Cry1Ga2      CryIM          Y09326
Cry1Gb1      CryH2          U70725
Cry1Ha1      PrtC           Z22513
Cry1Hb1      -              U35780
Cry1Ia1      CryV           X62821
Cry1Ia2      CryV           [illegible]
Cry1Ia3      CryV           [illegible]
Cry1Ia4      CryV           [illegible]
Cry1Ia5      CryV           Y08920
Cry1Ib1      CryV           U07642
Cry1Ja1      ET4            L32019
Cry1Jb1      ET1            U31527
Cry1Ka1      [illegible]    U28801
Cry2Aa1      CryIIA         M31738
Cry2Aa2      CryIIA         M23723
Cry2Aa3      [illegible]    D86084
Cry2Ab1      [illegible]    M23724
Cry2Ab2      [illegible]    X55416
Cry2Ac1      [illegible]    X57252
Cry3Aa1      CryIIIA        M22472
Cry3Aa2      CryIIIA        J02978
Cry3Aa3      [illegible]    Y00420
Cry3Aa4      [illegible]    [illegible]
Cry3Aa5      CryIIIA        M37207
Cry3Aa6      CryIIIA        U10985
Cry3Ba1      CryIIIB        X17123
Cry3Ba2      CryIIIB        A07234
Cry3Bb1      CryIIIB2       M89794
Cry3Bb2      CryIIIC(b)     U31633
Cry3Ca1      CryIIID        X59797
Cry4Aa1      CryIVA         Y00423
Cry4Aa2      CryIVA         D00248
Cry4Ba1      CryIVB         X07423
Cry4Ba2      CryIVB         X07082
Cry4Ba3      CryIVB         M20242
Cry4Ba4      CryIVB         D00247
Cry5Aa1      CryVA(a)       L07025
Cry5Ab1      CryVA(b)       L07026
Cry5Ba1      [illegible]    U19725
Cry6Aa1      CryVIA         L07022
Cry6Ba1      CryVIB         L07024
Cry7Aa1      CryIIIC        M64478
Cry7Ab1      CryIIICb       U04367
Cry8Aa1      CryIIIE        U04364
Cry8Ba1      CryIIIG        U04365
Cry8Ca1      CryIIIF        U04366
Cry9Aa1      CryIG          X58120
Cry9Aa2      CryIG          X58534
Cry9Ba1      CryIX          X75019
Cry9Ca1      CryIH          Z37527
Cry9Da1      N141           D85560
Cry10Aa1     CryIVC         M12662
Cry11Aa1     CryIVD         M31737
Cry11Aa2     CryIVD         M22860
Cry11Ba1     Jeg80          X86902
Cry12Aa1     CryVB          L07027
Cry13Aa1     CryVC          L07023
Cry14Aa1     CryVD          U13955
Cry15Aa1     34kDa          M76442
Cry16Aa1     cbm71          X94146
Cry17Aa1     cbm71          X99478
Cry18Aa1     CryBP1         X99049
Cry19Aa1     Jeg65          Y08920
Cry20Aa1     -              U82518
Cry21Aa1     -              I32932
Cry22Aa1     -              I34547
Cyt1Aa1      CytA           X03182
Cyt1Aa2      CytA           X04338
Cyt1Aa3      CytA           Y00135
Cyt1Aa4      CytA           M35968
Cyt1Ab1      CytM           X98793
Cyt1Ba1      -              U37196
Cyt2Aa1      CytB           Z14147
Cyt2Ba1      "CytB"         U52043
Cyt2Ba2      "CytB"         AF020789
Cyt2Ba3      "CytB"         AF022884
Cyt2Ba4      "CytB"         AF022885
Cyt2Ba5      "CytB"         AF022886
Cyt2Bb1      -              U82519

a Adapted from: http://epimix.biols.susx.ac.uk/Home/N^^ 
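
The revised names in the "New" column of Table 1 appear to follow a regular four-part pattern: a family prefix (Cry or Cyt), a number, an uppercase letter, a lowercase letter, and a final number. The following Python sketch, offered only as an illustration under that assumption, splits a revised name into its parts. The names `ToxinName`, `parse_name`, and `same_class` are hypothetical and are not taken from the nomenclature itself.

```python
import re
from typing import NamedTuple


class ToxinName(NamedTuple):
    family: str      # "Cry" or "Cyt"
    primary: int     # first rank, e.g. 3 in Cry3Bb1
    secondary: str   # second rank, uppercase letter, e.g. "B"
    tertiary: str    # third rank, lowercase letter, e.g. "b"
    quaternary: int  # fourth rank, e.g. 1


# Assumed pattern, inferred from the entries in the "New" column.
_NAME = re.compile(r"^(Cry|Cyt)(\d+)([A-Z])([a-z])(\d+)$")


def parse_name(name):
    """Split a revised-nomenclature name such as 'Cry3Bb1' into its ranks."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a revised-nomenclature name: {name!r}")
    family, primary, secondary, tertiary, quaternary = match.groups()
    return ToxinName(family, int(primary), secondary, tertiary, int(quaternary))


def same_class(a, b):
    """True when two names share the family prefix and the first rank."""
    return parse_name(a)[:2] == parse_name(b)[:2]


if __name__ == "__main__":
    print(parse_name("Cry3Bb1"))
    print(same_class("Cry3Aa1", "Cry3Bb2"))
```

Old-style names such as CryIIIB2 do not fit this pattern and are rejected, which keeps the two naming schemes from being mixed silently.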



1.2.3 Bioinsecticide Polypeptide Compositions

The utility of bacterial crystal proteins as insecticides was extended beyond lepidopteran
and dipteran larvae when the first isolation of a coleopteran-toxic B. thuringiensis strain
was reported (Krieg et al., 1983; 1984). This strain (described in U.S. Patent 4,766,203,
specifically incorporated herein by reference), designated B. thuringiensis var.
tenebrionis, is reported to be toxic to larvae of the coleopteran insects Agelastica alni
(blue alder leaf beetle) and Leptinotarsa decemlineata (Colorado potato beetle).

U.S. Patent 5,024,837 also describes hybrid B. thuringiensis var. kurstaki strains which
showed activity against lepidopteran insects. U.S. Patent 4,797,279 (corresponding to EP
0221024) discloses a hybrid B. thuringiensis containing a plasmid from B. thuringiensis var.
kurstaki carrying a lepidopteran-toxic crystal protein-encoding gene and a plasmid from B.
thuringiensis tenebrionis carrying a coleopteran-toxic crystal protein-encoding gene. The
hybrid B. thuringiensis strain produces crystal proteins characteristic of those made by
both B. thuringiensis kurstaki and B. thuringiensis tenebrionis. U.S. Patent 4,910,016
(corresponding to EP 0303379) discloses a B. thuringiensis isolate identified as B.
thuringiensis MT 104 which has insecticidal activity against coleopterans and
lepidopterans.

1.2.4 Molecular Genetic Techniques Facilitate Protein Engineering 

The revolution in molecular genetics over the past decade has facilitated a logical and
orderly approach to engineering proteins with improved properties. Site-specific and random
mutagenesis methods, the advent of polymerase chain reaction (PCR™) methodologies, and
related advances in the field have provided an extensive collection of tools for changing
both the amino acid sequences and the underlying genetic sequences of a variety of proteins
of commercial, medical, and agricultural interest.

Following the rapid increase in the number and types of crystal proteins which have been
identified in the past decade, researchers began to theorize about using such techniques to
improve the insecticidal activity of various crystal proteins. In theory, improvements to
δ-endotoxins should be possible using the methods available to protein engineers working in
the art, and it was logical to assume that it would be possible to isolate improved variants
of the wild-type crystal proteins isolated to date. By strengthening one or more of the
aforementioned steps in the mode of action of the toxin, improved molecules should provide
enhanced activity, and therefore, represent a breakthrough in the field. If specific amino
acid residues on the protein are identified to be responsible for a specific step in the
mode of action, then these residues can be targeted for mutagenesis to improve performance.


1.2.5 Structural Analyses of Crystal Proteins

The combination of structural analyses of B. thuringiensis toxins followed by an
investigation of the function of such structures, motifs, and the like has taught that
specific regions of crystal protein endotoxins are, in a general way, responsible for
particular functions.

Domain 1, for example, from Cry3Bb and Cry1Ac has been found to be responsible for ion
channel activity, the initial step in formation of a pore (Walters et al., 1993; Von Tersch
et al., 1994). Domains 2 and 3 have been found to be responsible for receptor binding and
insecticidal specificity (Aronson et al., 1995; Caramori et al., 1991; Chen et al., 1993; de
Maagd et al., 1996; Ge et al., 1991; Lee et al., 1992; Lee et al., 1995; Lu et al., 1994;
Smedley and Ellar, 1996; Smith and Ellar, 1994; Rajamohan et al., 1995; Rajamohan et al.,
1996; Wu and Dean, 1996). Regions in domains 2 and 3 can also impact the ion channel
activity of some toxins (Chen et al., 1993; Wolfersberger et al., 1996; Von Tersch et al.,
1994).

1.3 Deficiencies in the Prior Art

Unfortunately, while many laboratories have attempted to make mutated crystal proteins, few
have succeeded in making mutated crystal proteins with improved lepidopteran toxicity. In
almost all of the examples of genetically-engineered B. thuringiensis toxins in the
literature, the biological activity of the mutated crystal protein is no better than that of
the wild-type protein, and in many cases, the activity is decreased or destroyed altogether
(Almond and Dean, 1993; Aronson et al., 1995; Chen et al., 1993; Chen et al., 1995; Ge et
al., 1991; Kwak et al., 1995; Lu et al., 1994; Rajamohan et al., 1995; Rajamohan et al.,
1996; Smedley and Ellar, 1996; Smith and Ellar, 1994; Wolfersberger et al., 1996; Wu and
Aronson, 1992).

For a crystal protein having approximately 650 amino acids in the sequence of its active
toxin, and the possibility of 20 different amino acids at each position in this sequence,
the likelihood of arbitrarily creating a successful new structure is remote, even if a
general function can be assigned to a stretch of 250-300 amino acids. Indeed, the above
prior art with respect to crystal protein gene mutagenesis has been concerned primarily with
studying the structure and function of the crystal proteins, using mutagenesis to perturb
some step in the mode of action, rather than with engineering improved toxins.

Collectively, the limited successes in the art in developing synthetic toxins with improved
insecticidal activity have stifled progress in this area and confounded the search for
improved endotoxins or crystal proteins. Rather than following simple and predictable rules,
the successful engineering of an improved crystal protein may involve different strategies,
depending on the crystal protein being improved and the insect pests being targeted. Thus,
the process is highly empirical.

Accordingly, traditional recombinant DNA technology is clearly not routine experimentation
for providing improved insecticidal crystal proteins. What are lacking in the prior art are
rational methods for producing genetically-engineered B. thuringiensis crystal proteins
that have improved insecticidal activity and, in particular, improved toxicity towards a
wide range of lepidopteran insect pests.

2.0 Summary of the Invention

The present invention seeks to overcome these and other drawbacks inherent in the prior art
by providing genetically-engineered modified B. thuringiensis δ-endotoxins (Cry*), and in
particular modified Cry3 δ-endotoxins (designated Cry3* endotoxins). Also provided are
nucleic acid sequences comprising one or more genes which encode such modified proteins.
Particularly preferred genes include cry3* genes such as cry3A*, cry3B*, and cry3C* genes,
particularly cry3B* genes, and more particularly, cry3Bb* genes, that encode modified
crystal proteins having improved insecticidal activity against target pests.

Also disclosed are novel methods for constructing synthetic Cry3* proteins,
synthetically-modified nucleic acid sequences encoding such proteins, and compositions
arising therefrom. Also provided are synthetic cry3* expression vectors and various methods
of using the improved genes and vectors. In a preferred embodiment, the invention discloses
and claims Cry3B* proteins and cry3B* genes which encode improved insecticidal
polypeptides.

In preferred embodiments, channel-forming toxin design methods are disclosed which have
been used to produce a specific set of designed Cry3Bb* toxins with improved biological
activity. These improved Cry3Bb* proteins are listed in Table 2 along with their respective
amino acid changes from wild-type (WT) Cry3Bb, the nucleotide changes present in the altered
cry3Bb* gene encoding the protein, the fold increase in bioactivity over WT Cry3Bb, the
structural site of the alteration, and the design method(s) used to create the new toxins.

Accordingly, the present invention provides, in an overall and general sense, mutagenized
Cry3 protein-encoding genes and methods of making and using such genes. As used herein the
term "mutagenized cry3 gene(s)" means one or more cry3 genes that have been mutagenized or
altered to contain one or more nucleotide sequences which are not present in the wild type
sequences, and which encode mutant Cry3 crystal proteins (Cry3*) showing improved
insecticidal activity. Such mutagenized cry3 genes have been referred to in the
Specification as cry3* genes. Exemplary cry3* genes include cry3A*, cry3B*, and cry3C*
genes.

Exemplary mutagenized Cry3 protein-encoding genes include cry3B genes. As used herein the
term "mutagenized cry3B gene(s)" means one or more genes that have been mutagenized or
altered to contain one or more nucleotide sequences which are not present in the wild type
sequences, and which encode mutant Cry3B crystal proteins (Cry3B*) showing improved
insecticidal activity. Such genes have been designated cry3B* genes. Exemplary cry3B* genes
include cry3Ba* and cry3Bb* genes, which encode Cry3Ba* and Cry3Bb* proteins, respectively.

Likewise, the present invention provides mutagenized Cry3A protein-encoding genes and
methods of making and using such genes. As used herein the term "mutagenized cry3A gene(s)"
means one or more genes that have been mutagenized or altered to contain one or more
nucleotide sequences which are not present in the wild type sequences, and which encode
mutant Cry3A crystal proteins (Cry3A*) showing improved insecticidal activity. Such
mutagenized genes have been designated as cry3A* genes.

In similar fashion, the present invention provides mutagenized Cry3C protein-encoding genes
and methods of making and using such genes. As used herein the term "mutagenized cry3C
gene(s)" means one or more genes that have been mutagenized or altered to contain one or
more nucleotide sequences which are not present in the wild type sequences, and which encode
mutant Cry3C crystal proteins (Cry3C*) showing improved insecticidal activity. Such
mutagenized genes have been designated as cry3C* genes.

Preferably the novel sequences comprise nucleic acid sequences in which at least one, and
preferably, more than one, and most preferably, a significant number, of wild-type cry3
nucleotides have been replaced with one or more nucleotides, or where one or more
nucleotides have been added to or deleted from the native nucleotide sequence for the
purpose of altering, adding, or deleting the corresponding amino acids encoded by the
nucleic acid sequence so mutagenized. The desired result, therefore, is alteration of the
amino acid sequence of the encoded crystal protein to provide toxins having improved or
altered activity and/or specificity compared to that of the unmodified crystal protein.

Examples of preferred Cry3Bb*-encoding genes include cry3Bb.60, cry3Bb.11221, cry3Bb.11222,
cry3Bb.11223, cry3Bb.11224, cry3Bb.11225, cry3Bb.11226, cry3Bb.11227, cry3Bb.11228,
cry3Bb.11229, cry3Bb.11230, cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234,
cry3Bb.11235, cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241,
cry3Bb.11242, cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048,
cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082, cry3Bb.11083,
cry3Bb.11084, cry3Bb.11095, and cry3Bb.11098.
In a variety of illustrative embodiments, the inventors have shown remarkable success in
generating toxins with improved insecticidal activity using these methods. In particular,
the inventors have identified unique methods of analyzing and designing toxins having
improved or enhanced insecticidal properties both in vitro and in vivo.

In addition to modifications of Cry3Bb peptides, those having benefit of the present
teaching are now also able to make mutations in a variety of channel-forming toxins, and
particularly in crystal proteins which are related to Cry3Bb either functionally or
structurally. In fact, the inventors contemplate that any B. thuringiensis crystal protein
or peptide can be analyzed using the methods disclosed herein and may be altered using the
methods disclosed herein to produce crystal proteins having improved insecticidal
specificity or activity. Alternatively, the inventors contemplate that those of skill in the
art having the benefit of the teachings disclosed herein will be able to prepare not only
mutated Cry3 toxins with improved activity, but also other crystal proteins including all of
those proteins identified in Table 1, herein. In particular, the inventors contemplate the
creation of Cry3* variants using one or more of the methods disclosed herein to produce
toxins with improved activity. For example, the inventors note that Cry3A, Cry3B, and Cry3C
crystal proteins (which are known in the art) may be modified using one or more of the
design strategies employed herein, to prepare synthetically-modified crystal proteins with
improved properties. Likewise, one of skill in the art will even be able to utilize the
teachings of the present disclosure to modify other channel-forming toxins, including
channel-forming toxins other than B. thuringiensis crystal proteins, and even to modify
proteins and channel toxins not yet described or characterized.

Because the structures for insecticidal crystal proteins show a remarkable conservation of
protein tertiary structure (Grochulski et al., 1995), and because many crystal proteins show
significant amino acid sequence identity to the Cry3Bb amino acid sequence within domain 1,
including proteins of the Cry1, Cry2, Cry3, Cry4, Cry5, Cry7, Cry8, Cry9, Cry10, Cry11,
Cry12, Cry13, Cry14, and Cry16 classes (Table 1), now in light of the inventors' surprising
discovery, for the first time, those of skill in the art having benefit of the teachings
disclosed herein will be able to broadly apply the methods of the invention to modifying a
host of crystal proteins to obtain improved activity or altered specificity. Such methods
will not be limited only to the insecticidal crystal proteins disclosed in Table 1, but may
also be applied to any other related crystal protein, including those yet to be identified.

In particular, the high degree of homology between Cry3A, Cry3B, and Cry3C proteins is
evident in the alignment of the primary amino acid sequence of the three proteins (FIG. 17A,
FIG. 17B, and FIG. 17C).

As such, the disclosed methods may now be applied to preparation of modified crystal
proteins having one or more alterations introduced using one or more of the mutational
design methods as disclosed herein. The inventors further contemplate that regions may be
identified in one or more domains of a crystal protein, or other channel-forming toxin,
which may be similarly modified through site-specific or random mutagenesis to generate
toxins having improved activity, or alternatively, altered specificity.

In certain applications, the creation of altered toxins having increased activity against
one or more insects is desired. Alternatively, it may be desirable to utilize the methods
described herein for creating and identifying altered insecticidal crystal proteins which
are active against a wider spectrum of susceptible insects. The inventors further
contemplate that the creation of chimeric insecticidal crystal proteins comprising one or
more of these mutations may be desirable for preparing "super" toxins which have the
combined advantages of increased insecticidal activity and concomitant broad spectrum
activity.

In light of the present disclosure, the mutagenesis of one or more codons within the
sequence of a toxin may result in the generation of a host of related insecticidal proteins
having improved activity. While exemplary mutations have been described for each of the
design strategies employed in the present invention, the inventors contemplate that
mutations may also be made in other parts of insecticidal crystal proteins, including the
loop regions, helical regions, active sites of the toxins, regions involved in protein
oligomerization, and the like, which will give rise to functional bioinsecticidal crystal
proteins. All such mutations are considered to fall within the scope of this disclosure.

In one illustrative embodiment, mutagenized cry3Bb* genes are obtained which encode Cry3Bb*
variants that are generally based upon the wild-type Cry3Bb sequence, but that have one or
more changes incorporated into the amino acid sequence of the protein using one or more of
the design strategies described and claimed herein.
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In these and other embodiments, the mutated genes encoding the crystal proteins 
may be modified so as to change about one, two, three, four, or five or so amino acids in the 
primary sequence of the encoded polypeptide. Alternatively even more changes from the 
native sequence may be introduced, such that the encoded protein may have at least about 1 % 
5 or 2%, or alternatively about 3% or about 4%, or even about 5% to about 10%, or about 10% 
to about 1 5%, or even about 1 5% to about 20% or more of the codons either altered, deleted, 
or otherwise modified. In certain situations, it may even be desirable to alter substantially 
more of the primary amino acid sequence to obtain the desired modified protein. In such 
cases the inventors contemplate that from about 25%, to about 50%, or even from about 50% 

10 to about 75%, or more of the native (or wild-type) codons either altered, deleted, or otherwise 
modified. Alternatively, mutations in the amino acid sequences or underlying DNA gene 
sequences which result in the insertion or deletion of one or more amino acids within one or 
more regions of the crystal protein or peptide. 
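
The percentages recited above can be illustrated with a short calculation. The following is
a minimal sketch only, assuming two in-frame coding sequences of equal length that differ by
substitutions alone; insertions and deletions would first require an alignment, which is not
shown. The function name and the example sequences are hypothetical and are not taken from
the Sequence Listing.

```python
def percent_codons_altered(native: str, mutant: str) -> float:
    """Return the percentage of codons that differ between two coding sequences.

    Assumption: both sequences are in frame, of equal length, and differ only
    by nucleotide substitutions (no insertions or deletions).
    """
    if len(native) != len(mutant):
        raise ValueError("sequences must have the same length")
    if len(native) == 0 or len(native) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of three")

    total_codons = len(native) // 3
    # Compare the sequences codon by codon, ignoring letter case.
    changed = sum(
        native[i:i + 3].upper() != mutant[i:i + 3].upper()
        for i in range(0, len(native), 3)
    )
    return 100.0 * changed / total_codons
```

For example, a hypothetical five-codon sequence in which two codons are altered gives a
value of 40%, which would fall within the "about 25% to about 50%" range described above.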

To effect such changes in the primary sequence of the encoded polypeptides, it
may be desirable to mutate or delete one or more nucleotides from the nucleic acid sequences
of the genes encoding such polypeptides, or alternatively, under certain circumstances to add
one or more nucleotides into the primary nucleic acid sequence at one or more sites in the
sequence. Frequently, several nucleotide residues may be altered to produce the desired
polypeptide. As such, the inventors contemplate that in certain embodiments it may be
desirable to alter only one, two, three, four, or five or so nucleotides in the primary sequence.
In other embodiments, in which more changes are desired, the mutagenesis may involve
changing, deleting, or inserting 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or even 20 or
so nucleotide residues in the gene sequence. In still other embodiments, one may desire to
mutate, delete, or insert 21, 22, 23, 24, 25, 26, 27, 28, 29, 30-40, 40-50, 50-60, 60-70, 70-80,
80-90, or even 90-100, 150, 200, 250, 300, 350, 400, 450, or more nucleotides in the
sequence of the gene in order to prepare a cry3* gene which produces a Cry3* polypeptide
having the desired characteristics. In fact, any number of mutations, deletions, and/or
insertions may be made in the primary sequence of the gene, so long as the encoded protein
has the improved insecticidal activity or specificity characteristics described herein.

Changing a large number of the codons in the nucleotide sequence of an
endotoxin-encoding gene may be particularly desirable and often necessary to achieve the
desired results, particularly in the situation of "plantizing" a DNA sequence in order to
express a DNA of non-plant origin in a transformed plant cell. Such methods are routine to
those of skill in the plant genetics arts, and frequently many residues of a primary gene
sequence will be altered to facilitate expression of the gene in the plant cell. Preferably, the
changes in the gene sequence introduce no changes in the amino acid sequence, or introduce
only conservative replacements in the amino acid sequence such that the polypeptide
produced in the plant cell from the "plantized" nucleotide sequence is still fully functional,
and has the desired qualities when expressed in the plant cell.
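
The requirement that a "plantized" sequence introduce no changes in the amino acid sequence
can be sketched in code. The following is an illustration only and is not the procedure used
by the inventors. It assumes the standard genetic code, and it selects the most GC-rich
synonymous codon purely as a stand-in for a real host codon-usage table, which this
specification does not provide. The names `translate` and `plantize` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(codon: str) -> int:
    """Number of G or C nucleotides in a codon."""
    return sum(base in "GC" for base in codon)


# ASSUMPTION: the "preferred" codon for each residue is the most GC-rich
# synonymous codon (ties broken alphabetically). A real application would
# substitute a codon-usage table measured for the intended host plant.
PREFERRED = {}
for codon, aa in CODON_TABLE.items():
    best = PREFERRED.get(aa)
    if best is None or (gc_count(codon), codon) > (gc_count(best), best):
        PREFERRED[aa] = codon


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def plantize(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    recoded = "".join(PREFERRED[aa] for aa in translate(dna))
    # The recoded gene must encode exactly the same polypeptide.
    if translate(recoded) != translate(dna):
        raise AssertionError("recoding changed the encoded protein")
    return recoded
```

With these assumptions, the hypothetical fragment ATGAATAAAGAATTA is recoded to
ATGAACAAGGAGCTG; four of its five codons change while the encoded peptide (MNKEL) does not.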

Genes and encoded proteins mutated in the manner of the invention may also be
operatively linked to other protein-encoding nucleic acid sequences, or expressed as fusion
proteins. Both N-terminal and C-terminal fusion proteins are contemplated. Virtually any
protein- or peptide-encoding DNA sequence, or combinations thereof, may be fused to a
mutated cry3* sequence in order to encode a fusion protein. This includes DNA sequences
that encode targeting peptides, proteins for recombinant expression, proteins to which one or
more targeting peptides is attached, protein subunits, domains from one or more crystal
proteins, and the like. Such modifications to primary nucleotide sequences to enhance,
target, or optimize expression of the gene sequence in a particular host cell, tissue, or cellular
localization, are well-known to those of skill in the art of protein engineering and molecular
biology, and it will be readily apparent to such artisans, having benefit of the teachings of
this specification, how to facilitate such changes in the nucleotide sequence to produce the
polypeptides and polynucleotides disclosed herein.

In one aspect, the invention discloses and claims host cells comprising one or
more of the modified crystal proteins disclosed herein, and in particular, cells of B.
thuringiensis strains EG11221, EG11222, EG11223, EG11224, EG11225, EG11226,
EG11227, EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234,
EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032,
EG11035, EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081,
EG11082, EG11083, EG11084, EG11095, and EG11098 which comprise recombinant DNA
segments encoding synthetically-modified Cry3Bb* crystal proteins which demonstrate
improved insecticidal activity.

Likewise, the invention also discloses and claims cell cultures of B. thuringiensis
EG11221, EG11222, EG11223, EG11224, EG11225, EG11226, EG11227, EG11228,
EG11229, EG11230, EG11231, EG11232, EG11233, EG11234, EG11235, EG11236,
EG11237, EG11238, EG11239, EG11241, EG11242, EG11032, EG11035, EG11036,
EG11046, EG11048, EG11051, EG11057, EG11058, EG11081, EG11082, EG11083,
EG11084, EG11095, and EG11098.

Such cell cultures may be biologically-pure cultures consisting of a single strain,
or alternatively may be cell co-cultures consisting of one or more strains. Such cell cultures
may be cultivated under conditions in which one or more additional B. thuringiensis or other
bacterial strains are simultaneously co-cultured with one or more of the disclosed cultures, or
alternatively, one or more of the cell cultures of the present invention may be combined with
one or more additional B. thuringiensis or other bacterial strains following the independent
culture of each. Such procedures may be useful when suspensions of cells containing two or
more different crystal proteins are desired.

The subject cultures have been deposited under conditions that assure that access
to the cultures will be available during the pendency of this patent application to one
determined by the Commissioner of Patents and Trademarks to be entitled thereto under
37 C.F.R. §1.14 and 35 U.S.C. §122. The deposits are available as required by foreign patent
laws in countries wherein counterparts of the subject application, or its progeny, are filed.
However, it should be understood that the availability of a deposit does not constitute a
license to practice the subject invention in derogation of patent rights granted by
governmental action.

Further, the subject culture deposits will be stored and made available to the
public in accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms,
i.e., they will be stored with all the care necessary to keep them viable and uncontaminated
for a period of at least five years after the most recent request for the furnishing of a sample of
the deposit, and in any case, for a period of at least 30 (thirty) years after the date of deposit
or for the enforceable life of any patent which may issue disclosing the cultures. The
depositor acknowledges the duty to replace the deposits should the depository be unable to
furnish a sample when requested, due to the condition of the deposits. All restrictions on the
availability to the public of the subject culture deposits will be irrevocably removed upon the
granting of a patent disclosing them.

Cultures shown in Table 3 were deposited in the permanent collection of the
Agricultural Research Service Culture Collection, Northern Regional Research Laboratory
(NRRL) under the terms of the Budapest Treaty.

Table 3

Strains of the Present Invention Deposited Under the Terms
of the Budapest Treaty

Strain      Deposit Date    Protein         Accession Number
                                            (NRRL Number)

EG11032     5/27/97         Cry3Bb.11032    B-21744
EG11035     5/27/97         Cry3Bb.11035    B-21745
EG11036     5/27/97         Cry3Bb.11036    B-21746
EG11037     5/27/97         Cry3Bb.11037    B-21747
EG11046     5/27/97         Cry3Bb.11046    B-21748
EG11048     5/27/97         Cry3Bb.11048    B-21749
EG11051     5/27/97         Cry3Bb.11051    B-21750
EG11057     5/27/97         Cry3Bb.11057    B-21751
EG11058     5/27/97         Cry3Bb.11058    B-21752
EG11081     5/27/97         Cry3Bb.11081    B-21753
EG11082     5/27/97         Cry3Bb.11082    B-21754
EG11083     5/27/97         Cry3Bb.11083    B-21755
EG11084     5/27/97         Cry3Bb.11084    B-21756
EG11095     5/27/97         Cry3Bb.11095    B-21757
EG11204     5/27/97         Cry3Bb.11204    B-21758
EG11221     5/27/97         Cry3Bb.11221    B-21759
EG11222     5/27/97         Cry3Bb.11222    B-21760
EG11223     5/27/97         Cry3Bb.11223    B-21761
EG11224     5/27/97         Cry3Bb.11224    B-21762
EG11225     5/27/97         Cry3Bb.11225    B-21763
EG11226     5/27/97         Cry3Bb.11226    B-21764
EG11227     5/27/97         Cry3Bb.11227    B-12765
EG11228     5/27/97         Cry3Bb.11228    B-12766
EG11229     5/27/97         Cry3Bb.11229    B-21767
EG11230     5/27/97         Cry3Bb.11230    B-21768
EG11231     5/27/97         Cry3Bb.11231    B-21769
EG11232     5/27/97         Cry3Bb.11232    B-12770
EG11233     5/27/97         Cry3Bb.11233    B-21771
EG11234     5/27/97         Cry3Bb.11234    B-21772
EG11235     5/27/97         Cry3Bb.11235    B-21773
EG11236     5/27/97         Cry3Bb.11236    B-21774
EG11237     5/27/97         Cry3Bb.11237    B-21775
EG11238     5/27/97         Cry3Bb.11238    B-21776
EG11239     5/27/97         Cry3Bb.11239    B-21777
EG11241     5/27/97         Cry3Bb.11241    B-21778
EG11242     5/27/97         Cry3Bb.11242    B-21779

Also disclosed are methods of controlling or eradicating an insect population from
an environment. Such methods generally comprise contacting the insect population to be
controlled or eradicated with an insecticidally-effective amount of a Cry3* crystal protein
composition. Preferred Cry3* compositions include Cry3A*, Cry3B*, and Cry3C*
polypeptide compositions, with Cry3B* compositions being particularly preferred. Examples
of such polypeptides include proteins selected from the group consisting of Cry3Bb.60,
Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225,
Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046,
Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095, and Cry3Bb.11098.

In preferred embodiments, these Cry3Bb* crystal protein compositions comprise
the amino acid sequence of any of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ
ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID
NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52,
SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID
NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, or
SEQ ID NO:108.

2.1 Methods for Producing Modified Cry* Proteins 

The modified Cry* polypeptides of the present invention are preparable by a
process which generally involves the steps of obtaining a nucleic acid sequence encoding a
Cry* polypeptide; analyzing the structure of the polypeptide to identify particular "target"
sites for mutagenesis of the underlying gene sequence; introducing one or more mutations
into the nucleic acid sequence to produce a change in one or more amino acid residues in the
encoded polypeptide sequence; and expressing in a transformed host cell the mutagenized
nucleic acid sequence under conditions effective to obtain the modified Cry* protein encoded
by the cry* gene.

Means for obtaining the crystal structures of the polypeptides of the invention are
well-known. Exemplary high resolution crystal structure solution sets are given in Section
9.0 of the disclosure, and include the crystal structure of both the Cry3A and Cry3B
polypeptides disclosed herein. The information provided in Section 9.0 permits the analyses
disclosed in each of the methods herein which rely on the 3D crystal structure information for
targeting mutagenesis of the polypeptides to particular regions of the primary amino acid
sequences of the δ-endotoxins to obtain mutants with increased insecticidal activity or
enhanced insecticidal specificity.

A first method for producing a modified B. thuringiensis Cry3Bb δ-endotoxin
having improved insecticidal activity or specificity disclosed herein generally involves
obtaining a high-resolution 3D crystal structure of the endotoxin; locating in the crystal
structure one or more regions of bound water wherein the bound water forms a contiguous
hydrated surface separated by no more than about 16 Å; increasing the number of water
molecules in this surface by increasing the hydrophobicity of one or more amino acids of the
protein in the region; and obtaining the modified δ-endotoxin so produced. Exemplary
δ-endotoxins include Cry3Bb.11032, Cry3Bb.11227, Cry3Bb.11241, Cry3Bb.11051,
Cry3Bb.11242, and Cry3Bb.11098.

A second method for producing a modified B. thuringiensis Cry3Bb δ-endotoxin
having improved insecticidal activity comprises identifying a loop region in a δ-endotoxin;
modifying one or more amino acids in the loop to increase the hydrophobicity of the amino
acids; and obtaining the modified δ-endotoxin so produced. Preferred δ-endotoxins produced
by this method include Cry3Bb.11241, Cry3Bb.11242, Cry3Bb.11228, Cry3Bb.11229,
Cry3Bb.11230, Cry3Bb.11231, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238, and Cry3Bb.11239.

A method for increasing the mobility of channel-forming helices of a B.
thuringiensis Cry3B δ-endotoxin is also provided by the present invention. The method
generally comprises disrupting one or more hydrogen bonds formed between a first amino acid
of one or more of the channel-forming helices and a second amino acid of the δ-endotoxin.
The hydrogen bonds may be formed inter- or intramolecularly, and the disrupting may
consist of replacing a first or second amino acid with a third amino acid whose spatial distance
is greater than about 3 Å, or whose spatial orientation bond angle is not equal to 180±60
degrees relative to the hydrogen bonding site of the first or second amino acid. δ-Endotoxins
produced by this method and disclosed herein include Cry3Bb.11222, Cry3Bb.11223,
Cry3Bb.11224, Cry3Bb.11225, Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11231,
Cry3Bb.11241, Cry3Bb.11242, and Cry3Bb.11098.
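
The geometric criteria recited above (a distance greater than about 3 Å, or a bond angle
outside 180±60 degrees) can be expressed as a simple test on atomic coordinates. The
following is a minimal sketch only. It assumes that the distance is measured between the
donor and acceptor heavy atoms and that the angle is the donor-hydrogen-acceptor angle; the
specification does not state which atoms define these quantities, so both choices, and all
function names, are assumptions.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def hbond_geometry(donor, hydrogen, acceptor):
    """Return (donor-acceptor distance in angstroms, D-H...A angle in degrees)."""
    distance = _norm(_sub(acceptor, donor))
    v1 = _sub(donor, hydrogen)
    v2 = _sub(acceptor, hydrogen)
    cos_angle = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return distance, angle


def is_disrupted(donor, hydrogen, acceptor, max_distance=3.0, min_angle=120.0):
    """True if the geometry falls outside the recited hydrogen-bond criteria.

    An angle between two vectors lies in [0, 180], so "not equal to 180 +/- 60
    degrees" reduces to "less than 120 degrees".
    """
    distance, angle = hbond_geometry(donor, hydrogen, acceptor)
    return distance > max_distance or angle < min_angle
```

Under these assumptions, a linear arrangement with a 2.9 Å donor-acceptor separation
satisfies the criteria, whereas the same pair moved to 3.5 Å, or bent to a 90 degree angle,
is classed as disrupted.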

Also disclosed is a method of increasing the flexibility of a loop region in a
channel-forming domain of a B. thuringiensis Cry3Bb δ-endotoxin. This method comprises
obtaining a crystal structure of a Cry3Bb δ-endotoxin having one or more loop regions;
identifying the amino acids comprising the loop region; and altering one or more of the amino
acids to reduce steric hindrance in the loop region, wherein the altering increases flexibility
of the loop region in the δ-endotoxin. Examples of δ-endotoxins produced using this method
include Cry3Bb.11032, Cry3Bb.11051, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11227, Cry3Bb.11234, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11036, and Cry3Bb.11098.

Another aspect of the invention is a method for increasing the activity of a
δ-endotoxin, comprising reducing or eliminating binding of the δ-endotoxin to a carbohydrate
in a target insect gut. The eliminating or reducing may be accomplished by removal of one
or more α helices of domain 1 of the δ-endotoxin, for example, by removal of α helices α1,
α2a/b, and α3. An exemplary δ-endotoxin produced using the method is Cry3Bb.60.

Alternatively, the reducing or eliminating may be accomplished by replacing one
or more amino acids within loop β1,α8 with one or more amino acids having increased
hydrophobicity. Such a method gives rise to δ-endotoxins such as Cry3Bb.11228,
Cry3Bb.11230, Cry3Bb.11231, Cry3Bb.11237, and Cry3Bb.11098, which are described in
detail herein.

Alternatively, the reducing or eliminating is accomplished by replacing one or
more specific amino acids with any other amino acid. Such replacements are described in
Table 2, and in the examples herein. One example is the δ-endotoxin designated herein as
Cry3Bb.11221.

Also disclosed is a method of identifying a region of a Cry3Bb δ-endotoxin for
targeted mutagenesis, comprising: obtaining a crystal structure of the δ-endotoxin; identifying
from the crystal structure one or more surface-exposed amino acids in the protein; randomly
substituting one or more of the surface-exposed amino acids to obtain a plurality of mutated
polypeptides, wherein at least 50% of the mutated polypeptides have diminished insecticidal
activity; and identifying from the plurality of mutated polypeptides one or more regions of
the Cry3Bb δ-endotoxin for targeted mutagenesis. The method may further comprise
determining the amino acid sequences of a plurality of mutated polypeptides having diminished
activity, and identifying one or more amino acid residues required for insecticidal activity.
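
The bookkeeping implied by this screening method can be sketched as follows. This is an
illustration under stated assumptions, not the inventors' protocol: each mutated polypeptide
is assumed to be recorded as a tuple of substituted residue positions together with a single
activity value on the same scale as the wild type, and the `min_hits` threshold is a
hypothetical parameter. All positions and activity values in any usage are placeholders.

```python
from collections import Counter


def fraction_diminished(variants, wild_type_activity):
    """Fraction of variants whose activity is below that of the wild type."""
    variants = list(variants)
    if not variants:
        raise ValueError("at least one variant is required")
    diminished = sum(activity < wild_type_activity for _, activity in variants)
    return diminished / len(variants)


def candidate_positions(variants, wild_type_activity, min_hits=2):
    """Positions substituted in at least `min_hits` diminished-activity variants.

    `variants` is an iterable of (substituted_positions, activity) pairs.
    Positions that recur among low-activity variants are flagged as candidate
    sites in a region worth targeting.
    """
    hits = Counter()
    for positions, activity in variants:
        if activity < wild_type_activity:
            hits.update(set(positions))
    return sorted(pos for pos, count in hits.items() if count >= min_hits)
```

A library satisfies the "at least 50%" condition recited above when `fraction_diminished`
returns 0.5 or more; positions returned by `candidate_positions` would then be examined
against the crystal structure to define the region for targeted mutagenesis.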

In another embodiment, the invention provides a process for producing a Cry3Bb
δ-endotoxin having improved insecticidal activity. The process generally involves the steps
of obtaining a high-resolution crystal structure of the protein; determining the electrostatic
surface distribution of the protein; identifying one or more regions of high electrostatic
diversity; modifying the electrostatic diversity of the region by altering one or more amino
acids in the region; and obtaining a Cry3Bb δ-endotoxin which has improved insecticidal
activity. In one embodiment, the electrostatic diversity may be decreased relative to the
electrostatic diversity of a native Cry3Bb δ-endotoxin. Exemplary δ-endotoxins with decreased
electrostatic diversity include Cry3Bb.11227, Cry3Bb.11241, and Cry3Bb.11242.
Alternatively, the electrostatic diversity may be increased relative to the electrostatic diversity
of a native Cry3Bb δ-endotoxin. An exemplary δ-endotoxin with increased electrostatic
diversity is Cry3Bb.11234.

Furthermore, the invention also provides a method of producing a Cry3Bb
δ-endotoxin having improved insecticidal activity which involves obtaining a high-resolution
crystal structure; identifying the presence of one or more metal binding sites in the protein;
altering one or more amino acids in the binding site; and obtaining an altered protein,
wherein the protein has improved insecticidal activity. The altering may involve the
elimination of one or more metal binding sites. Exemplary δ-endotoxins include Cry3Bb.11222,
Cry3Bb.11224, Cry3Bb.11225, and Cry3Bb.11226.

A further aspect of the invention involves a method of identifying a B.
thuringiensis Cry3Bb δ-endotoxin having improved channel activity. This method in an
overall sense involves obtaining a Cry3Bb δ-endotoxin suspected of having improved
channel activity; and determining one or more of the following characteristics in the
δ-endotoxin, and comparing such characteristics to those obtained for the wild-type unmodified
δ-endotoxin: (1) the rate of channel formation, (2) the rate of growth of channel conductance,
or (3) the duration of open channel state. From this comparison, one may then select a
δ-endotoxin which has an increased rate of channel formation compared to the wild-type
δ-endotoxin. Examples of Cry3Bb δ-endotoxins prepared by this method include Cry3Bb.60,
Cry3Bb.11035, Cry3Bb.11048, Cry3Bb.11032, Cry3Bb.11223, Cry3Bb.11224,
Cry3Bb.11226, Cry3Bb.11221, Cry3Bb.11242, Cry3Bb.11230, and Cry3Bb.11098.
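
The comparison step of this method can be sketched as a simple selection over measured
characteristics. The following is an illustration only: the record type, field names, and
units are assumptions, and any values supplied to it are placeholders rather than data from
this disclosure. Consistent with the text above, selection is made on the rate of channel
formation; the other two characteristics are carried along for reference.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ChannelMetrics:
    """Hypothetical record of the three characteristics named in the text."""
    formation_rate: float           # (1) rate of channel formation
    conductance_growth_rate: float  # (2) rate of growth of channel conductance
    open_duration: float            # (3) duration of the open channel state


def select_improved(candidates: Dict[str, ChannelMetrics],
                    wild_type: ChannelMetrics) -> List[str]:
    """Names of candidates whose channel formation rate exceeds the wild type."""
    return sorted(
        name
        for name, metrics in candidates.items()
        if metrics.formation_rate > wild_type.formation_rate
    )
```

A stricter screen could additionally require improvement in conductance growth or open
duration; that choice is left open here because the text selects on formation rate alone.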

Also provided is a method for producing a modified Cry3Bb δ-endotoxin having
improved insecticidal activity which involves altering one or more non-surface amino acids
located at or near the point of greatest convergence of two or more loop regions of the
Cry3Bb δ-endotoxin, such that the altering decreases the mobility of one or more of the loop
regions. The mobility may conveniently be determined by comparing the thermal
denaturation of the modified protein to that of a wild-type Cry3Bb δ-endotoxin. An exemplary
crystal protein produced by this method is Cry3Bb.11095.

A further aspect of the invention involves a method for preparing a modified
Cry3Bb δ-endotoxin having improved insecticidal activity, comprising modifying one or
more amino acids in a loop region to increase the hydrophobicity of said amino acids; and
altering one or more of said amino acids to reduce steric hindrance in the loop region, wherein
the altering increases flexibility of the loop region in the endotoxin. Exemplary Cry3Bb
δ-endotoxins so produced include Cry3Bb.11057, Cry3Bb.11058,
Cry3Bb.11081, Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11231,
Cry3Bb.11235, and Cry3Bb.11098.

The invention also provides a method of improving the insecticidal activity of a B.
thuringiensis Cry3Bb δ-endotoxin, which generally comprises inserting one or more protease
sensitive sites into one or more loop regions of domain 1 of the δ-endotoxin. Preferably, the
loop region is α3,4, and an exemplary δ-endotoxin so produced is Cry3Bb.11221.

2.2 Polypeptide Compositions 

The crystal proteins so produced by each of the methods described herein also
represent important aspects of the invention. Such crystal proteins preferably include a
protein or peptide selected from the group consisting of Cry3Bb.60, Cry3Bb.11221,
Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225, Cry3Bb.11226,
Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230, Cry3Bb.11231,
Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235, Cry3Bb.11236,
Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241, Cry3Bb.11242,
Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046, Cry3Bb.11048,
Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081, Cry3Bb.11082,
Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095, and Cry3Bb.11098.

In preferred embodiments, the protein comprises a contiguous amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, and SEQ ID NO:108.

Highly preferred are those crystal proteins which are encoded by the nucleic acid
sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107, or a
nucleic acid sequence which hybridizes to the nucleic acid sequence of SEQ ID NO:1, SEQ
ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID
NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID
NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107 under conditions of moderate
stringency.

Amino acid, peptide and protein sequences within the scope of the present
invention include, but are not limited to, the sequences set forth in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, and SEQ ID NO:108, and alterations in the amino acid
sequences including alterations, deletions, mutations, and homologs.

Compositions which comprise from about 0.5% to about 99% by weight of the
crystal protein, or more preferably from about 5% to about 75%, or from about 25% to about
50% by weight of the crystal protein are provided herein. Such compositions may readily be
prepared using techniques of protein production and purification well-known to those of skill,
and the methods disclosed herein. Such a process for preparing a Cry3Bb* crystal protein
generally involves the steps of culturing a host cell which expresses the Cry3Bb* protein
(such as a B. thuringiensis EG11221, EG11222, EG11223, EG11224, EG11225, EG11226,
EG11227, EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234,
EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032,
EG11035, EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081,
EG11082, EG11083, EG11084, EG11095, or EG11098 cell) under conditions effective to
produce the crystal protein, and then obtaining the crystal protein so produced.

The protein may be present within intact cells, and as such, no subsequent protein
isolation or purification steps may be required. Alternatively, the cells may be broken,
sonicated, lysed, disrupted, or plasmolyzed to free the crystal protein(s) from the remaining cell
debris. In such cases, one may desire to isolate, concentrate, or further purify the resulting
crystals containing the proteins prior to use, such as, for example, in the formulation of
insecticidal compositions. The composition may ultimately be purified to consist almost
entirely of the pure protein, or alternatively, be purified or isolated to a degree such that the
composition comprises the crystal protein(s) in an amount of from between about 0.5% and
about 99% by weight, or in an amount of from between about 5% and about 95% by weight,
or in an amount of from between about 15% and about 85% by weight, or in an amount of
from between about 25% and about 75% by weight, or in an amount of from between about
40% and about 60% by weight, etc.

2.3 Recombinant Vectors Expressing cry3* Genes

One important embodiment of the invention is a recombinant vector which
comprises a nucleic acid segment encoding one or more of the novel B. thuringiensis crystal
proteins disclosed herein. Such a vector may be transferred to and replicated in a prokaryotic or
eukaryotic host, with bacterial cells being particularly preferred as prokaryotic hosts, and
plant cells being particularly preferred as eukaryotic hosts.

In preferred embodiments, the recombinant vector comprises a nucleic acid
segment encoding the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108. Highly preferred nucleic acid segments are those
which have the sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID
NO:107.

Another important embodiment of the invention is a transformed host cell which
expresses one or more of these recombinant vectors. The host cell may be either prokaryotic
or eukaryotic, and particularly preferred host cells are those which express the nucleic acid
segment(s) comprising the recombinant vector which encode one or more B. thuringiensis
crystal proteins comprising modified amino acid sequences in one or more loop regions of
domain 1, or between α helix 7 of domain 1 and β strand 1 of domain 2. Bacterial cells are
particularly preferred as prokaryotic hosts, and plant cells are particularly preferred as
eukaryotic hosts.

In an important embodiment, the invention discloses and claims a host cell
wherein the modified amino acid sequences comprise one or more loop regions between α
helices 1 and 2, α helices 2 and 3, α helices 3 and 4, α helices 4 and 5, α helices 5 and 6, or α
helices 6 and 7 of domain 1, or between α helix 7 of domain 1 and β strand 1 of domain 2. A
particularly preferred host cell is one that comprises the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ
ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46,
SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68,
SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, and more preferably,
one that comprises the nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID
NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99,
SEQ ID NO:101, or SEQ ID NO:107.

Bacterial host cells transformed with a nucleic acid segment encoding a modified
Cry3Bb crystal protein according to the present invention are disclosed and claimed herein,
and in particular, a B. thuringiensis cell having designation EG11221, EG11222, EG11223,
EG11224, EG11225, EG11226, EG11227, EG11228, EG11229, EG11230, EG11231,
EG11232, EG11233, EG11234, EG11235, EG11236, EG11237, EG11238, EG11239,
EG11241, EG11242, EG11032, EG11035, EG11036, EG11046, EG11048, EG11051,
EG11057, EG11058, EG11081, EG11082, EG11083, EG11084, EG11095, or EG11098.

In another embodiment, the invention encompasses a method of using a nucleic
acid segment of the present invention that encodes a cry3Bb* gene. The method generally
comprises the steps of: (a) preparing a recombinant vector in which the cry3Bb* gene is
positioned under the control of a promoter; (b) introducing the recombinant vector into a host
cell; (c) culturing the host cell under conditions effective to allow expression of the Cry3Bb*
crystal protein encoded by said cry3Bb* gene; and (d) obtaining the expressed Cry3Bb*
crystal protein or peptide.

A wide variety of ways are available for introducing a B. thuringiensis gene
expressing a toxin into the microorganism host under conditions which allow for stable
maintenance and expression of the gene. One can provide for DNA constructs which include the
transcriptional and translational regulatory signals for expression of the toxin gene, the toxin
gene under their regulatory control and a DNA sequence homologous with a sequence in the
host organism, whereby integration will occur, and/or a replication system which is
functional in the host, whereby integration or stable maintenance will occur.

The transcriptional initiation signals will include a promoter and a transcriptional
initiation start site. In some instances, it may be desirable to provide for regulative
expression of the toxin, where expression of the toxin will only occur after release into the
environment. This can be achieved with operators or a region binding to an activator or
enhancers, which are capable of induction upon a change in the physical or chemical
environment of the microorganisms. For example, a temperature sensitive regulatory region may be
employed, where the organisms may be grown up in the laboratory without expression of a
toxin, but upon release into the environment, expression would begin. Other techniques may
employ a specific nutrient medium in the laboratory, which inhibits the expression of the
toxin, where the nutrient medium in the environment would allow for expression of the toxin.
For translational initiation, a ribosomal binding site and an initiation codon will be present.

Various manipulations may be employed for enhancing the expression of the
messenger RNA, particularly by using an active promoter, as well as by employing sequences
which enhance the stability of the messenger RNA. The transcriptional and translational
termination region will involve stop codon(s), a terminator region, and optionally, a
polyadenylation signal. A hydrophobic "leader" sequence may be employed at the amino
terminus of the translated polypeptide sequence in order to promote secretion of the protein across
the inner membrane.

In the direction of transcription, namely in the 5' to 3' direction of the coding or
sense sequence, the construct will involve the transcriptional regulatory region, if any, and
the promoter, where the regulatory region may be either 5' or 3' of the promoter, the
ribosomal binding site, the initiation codon, the structural gene having an open reading frame in
phase with the initiation codon, the stop codon(s), the polyadenylation signal sequence, if
any, and the terminator region. This sequence as a double strand may be used by itself for
transformation of a microorganism host, but will usually be included with a DNA sequence
involving a marker, where the second DNA sequence may be joined to the toxin expression
construct during introduction of the DNA into the host.
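
By way of illustration only, the 5' to 3' ordering of construct elements described above
may be sketched as an ordered list. The function name, its parameters, and the element
labels are hypothetical conveniences chosen for this sketch and form no part of the
disclosure; the sketch merely restates the ordering given in the text.

```python
# Illustrative sketch only. The element labels, function name, and parameters
# are assumptions made for this example; they restate the 5'-to-3' ordering
# given in the text and do not describe any particular vector.

def assemble_construct(regulatory_region=None, polyadenylation=False):
    """Return the construct elements as a list in 5'-to-3' order.

    regulatory_region: None (absent), "5'" or "3'" (position relative to
                       the promoter, as the text permits either).
    polyadenylation:   include the optional polyadenylation signal.
    """
    if regulatory_region not in (None, "5'", "3'"):
        raise ValueError("regulatory_region must be None, \"5'\" or \"3'\"")

    elements = ["promoter"]
    if regulatory_region == "5'":
        elements.insert(0, "transcriptional regulatory region")
    elif regulatory_region == "3'":
        elements.append("transcriptional regulatory region")

    elements += [
        "ribosomal binding site",
        "initiation codon",
        "structural gene (ORF in phase with initiation codon)",
        "stop codon(s)",
    ]
    if polyadenylation:
        elements.append("polyadenylation signal")
    elements.append("terminator region")
    return elements


if __name__ == "__main__":
    for element in assemble_construct(regulatory_region="5'", polyadenylation=True):
        print(element)
```

The optional elements (regulatory region and polyadenylation signal) are modeled as
arguments because the text qualifies each of them with "if any".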

By a marker is intended a structural gene which provides for selection of those
hosts which have been modified or transformed. The marker will normally provide for
selective advantage, for example, providing for biocide resistance, e.g., resistance to antibiotics
or heavy metals; complementation, so as to provide prototrophy to an auxotrophic host, or the
like. Preferably, complementation is employed, so that the modified host may not only be
selected, but may also be competitive in the field. One or more markers may be employed in
the development of the constructs, as well as for modifying the host. The organisms may be
further modified by providing for a competitive advantage against other wild-type
microorganisms in the field. For example, genes expressing metal chelating agents, e.g.,
siderophores, may be introduced into the host along with the structural gene expressing the toxin.
In this manner, the enhanced expression of a siderophore may provide for a competitive
advantage for the toxin-producing host, so that it may effectively compete with the wild-type
microorganisms and stably occupy a niche in the environment.

Where no functional replication system is present, the construct will also include a
sequence of at least 50 basepairs (bp), preferably at least about 100 bp, more preferably at
least about 1000 bp, and usually not more than about 2000 bp of a sequence homologous
with a sequence in the host. In this way, the probability of legitimate recombination is
enhanced, so that the gene will be integrated into the host and stably maintained by the host.
Desirably, the toxin gene will be in close proximity to the gene providing for
complementation as well as the gene providing for the competitive advantage. Therefore, in the event that
a toxin gene is lost, the resulting organism will be likely to also lose the complementing gene
and/or the gene providing for the competitive advantage, so that it will be unable to compete
in the environment with the organism retaining the intact construct.

A large number of transcriptional regulatory regions are available from a wide
variety of microorganism hosts, such as bacteria, bacteriophage, cyanobacteria, algae, fungi,
and the like. Various transcriptional regulatory regions include the regions associated with
the trp gene, lac gene, gal gene, the λL and λR promoters, the tac promoter, and the
naturally-occurring promoters associated with the δ-endotoxin gene, where functional in the host. See
for example, U. S. Patents 4,332,898; 4,342,832; and 4,356,270 (each of which is specifically
incorporated herein by reference). The termination region may be the termination region
normally associated with the transcriptional initiation region or a different transcriptional
initiation region, so long as the two regions are compatible and functional in the host.

Where stable episomal maintenance or integration is desired, a plasmid will be
employed which has a replication system which is functional in the host. The replication
system may be derived from the chromosome, an episomal element normally present in the
host or a different host, or a replication system from a virus which is stable in the host. A
large number of plasmids are available, such as pBR322, pACYC184, RSF1010, pRO1614,
and the like. See for example, Olson et al. (1982); Bagdasarian et al. (1981); Baum et al.
(1990); and U. S. Patents 4,356,270; 4,362,817; 4,371,625; and 5,441,884, each incorporated
specifically herein by reference.

The B. thuringiensis gene can be introduced between the transcriptional and
translational initiation region and the transcriptional and translational termination region, so
as to be under the regulatory control of the initiation region. This construct will be included
in a plasmid, which will include at least one replication system, but may include more than
one, where one replication system is employed for cloning during the development of the
plasmid and the second replication system is necessary for functioning in the ultimate host.
In addition, one or more markers may be present, which have been described previously.
Where integration is desired, the plasmid will desirably include a sequence homologous with
the host genome.

The transformants can be isolated in accordance with conventional ways, usually
employing a selection technique, which allows for selection of the desired organism as
against unmodified organisms or transferring organisms, when present. The transformants
then can be tested for pesticidal activity. If desired, unwanted or ancillary DNA sequences
may be selectively removed from the recombinant bacterium by employing site-specific
recombination systems, such as those described in U. S. Patent 5,441,884 (specifically
incorporated herein by reference).

2.4 cry3 DNA Segments

A B. thuringiensis cry3* gene encoding a crystal protein having one or more
mutations in one or more regions of the peptide represents an important aspect of the invention.
Preferably, the cry3* gene encodes an amino acid sequence in which one or more amino acid
residues have been changed based on the methods disclosed herein, and particularly those
changes which have been made for the purpose of altering the insecticidal activity or
specificity of the crystal protein.

In accordance with the present invention, nucleic acid sequences include and are
not limited to DNA, including and not limited to cDNA and genomic DNA, genes; RNA,
including and not limited to mRNA and tRNA; antisense sequences, nucleosides, and suitable
nucleic acid sequences such as those set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID
NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99,
SEQ ID NO:101, or SEQ ID NO:107, and alterations in the nucleic acid sequences including
alterations, deletions, mutations, and homologs capable of expressing the B. thuringiensis
modified toxins of the present invention.

As such, the present invention also concerns DNA segments that are free from
total genomic DNA and that encode the novel synthetically-modified crystal proteins
disclosed herein. DNA segments encoding these peptide species may prove to encode
proteins, polypeptides, subunits, functional domains, and the like of crystal protein-related or
other non-related gene products. In addition, these DNA segments may be synthesized
entirely in vitro using methods that are well-known to those of skill in the art.

As used herein, the term "DNA segment" refers to a DNA molecule that has been
isolated free of total genomic DNA of a particular species. Therefore, a DNA segment
encoding a crystal protein or peptide refers to a DNA segment that contains crystal protein
coding sequences yet is isolated away from, or purified free from, total genomic DNA of the
species from which the DNA segment is obtained, which in the instant case is the genome of
the Gram-positive bacterial genus, Bacillus, and in particular, the species of Bacillus known
as B. thuringiensis. Included within the term "DNA segment" are DNA segments and
smaller fragments of such segments, and also recombinant vectors, including, for example,
plasmids, cosmids, phagemids, phage, viruses, and the like.

Similarly, a DNA segment comprising an isolated or purified crystal
protein-encoding gene refers to a DNA segment which may include, in addition to peptide encoding
sequences, certain other elements such as regulatory sequences, isolated substantially away
from other naturally occurring genes or protein-encoding sequences. In this respect, the term
"gene" is used for simplicity to refer to a functional protein-, polypeptide- or
peptide-encoding unit. As will be understood by those in the art, this functional term includes
genomic sequences, operon sequences and smaller engineered gene segments that express, or
may be adapted to express, proteins, polypeptides or peptides.

"Isolated substantially away from other coding sequences" means that the gene of
interest, in this case, a gene encoding a bacterial crystal protein, forms the significant part of
the coding region of the DNA segment, and that the DNA segment does not contain large
portions of naturally-occurring coding DNA, such as large chromosomal fragments or other
functional genes or operon coding regions. Of course, this refers to the DNA segment as
originally isolated, and does not exclude genes, recombinant genes, synthetic linkers, or
coding regions later added to the segment by the hand of man.

Particularly preferred DNA sequences are those encoding Cry3Bb.60,
Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225,
Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046,
Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095 and Cry3Bb.11098 crystal
proteins, and in particular cry3Bb* genes such as cry3Bb.60, cry3Bb.11221, cry3Bb.11222,
cry3Bb.11223, cry3Bb.11224, cry3Bb.11225, cry3Bb.11226, cry3Bb.11227, cry3Bb.11228,
cry3Bb.11229, cry3Bb.11230, cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234,
cry3Bb.11235, cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241,
cry3Bb.11242, cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048,
cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082, cry3Bb.11083,
cry3Bb.11084, cry3Bb.11095 and cry3Bb.11098. In particular embodiments, the invention
concerns isolated DNA segments and recombinant vectors incorporating DNA sequences that
encode a Cry peptide species that includes within its amino acid sequence an amino acid
sequence essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ
ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID
NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52,
SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID
NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102,
or SEQ ID NO:108.

The term "a sequence essentially as set forth in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108" means that the sequence substantially
corresponds to a portion of the sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108, and has relatively few amino acids that are not
identical to, or a biologically functional equivalent of, the amino acids of any of these
sequences. The term "biologically functional equivalent" is well understood in the art and is
further defined in detail herein (e.g., see Illustrative Embodiments).

Accordingly, sequences that have between about 70% and about 75% or between
about 75% and about 80%, or more preferably between about 81% and about 90%, or even
more preferably between about 91% or 92% or 93% and about 97% or 98% or 99% amino
acid sequence identity or functional equivalence to the amino acids of SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ
ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102 or SEQ ID NO:108 will be sequences that are "essentially
as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID
NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32,
SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID
NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID
NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, or SEQ ID
NO:108."
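
By way of illustration only, the percentage ranges recited above may be evaluated with a
simple identity calculation. The following is a minimal sketch which assumes that the two
amino acid sequences have already been aligned to equal length by some external method,
with "-" marking gaps. The function names, the default threshold argument, and the toy
sequences are hypothetical; the sketch counts strict identity only and does not attempt to
score biologically functional equivalence.

```python
# Illustrative sketch only. Assumes the two sequences were aligned beforehand
# (equal length, "-" marks a gap). Function names and the 70% default are
# assumptions for this example; only strict identity is counted, not
# biologically functional equivalence.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions at which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" and res_b == "-":
            continue  # column with gaps in both sequences carries no information
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when identity is at or above the chosen threshold (lowest recited bound by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Arbitrary toy strings, not taken from any SEQ ID NO.
    reference = "ACDEFGHIKL"
    variant = "ACDEFGHIKA"
    print(percent_identity(reference, variant))          # 9 of 10 positions match
    print(meets_identity_threshold(reference, variant))
```

A position at which only one sequence has a gap is counted as a mismatch in this sketch;
other conventions exist and would give slightly different percentages.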

It will also be understood that amino acid and nucleic acid sequences may include
additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences,
and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the
sequence meets the criteria set forth above, including the maintenance of biological protein
activity where protein expression is concerned. The addition of terminal sequences
particularly applies to nucleic acid sequences that may, for example, include various
non-coding sequences flanking either of the 5' or 3' portions of the coding region or may include
various internal sequences, i.e., introns, which are known to occur within genes.

The nucleic acid segments of the present invention, regardless of the length of the
coding sequence itself, may be combined with other DNA sequences, such as promoters,
polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall length may vary considerably. It is
therefore contemplated that a nucleic acid fragment of almost any length may be employed,
with the total length preferably being limited by the ease of preparation and use in the
intended recombinant DNA protocol.

For example, nucleic acid fragments may be prepared that include a short
contiguous stretch encoding the peptide sequence disclosed in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, or that are identical to or
complementary to DNA sequences which encode the peptide disclosed in SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46,
SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68,
SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, and particularly the
DNA segments disclosed in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID
NO:107.

Highly preferred nucleic acid segments of the present invention comprise one or
more cry genes of the invention, or a portion of one or more cry genes of the invention. For
certain applications, relatively small contiguous nucleic acid sequences are preferable, such as
those which are about 14 or 15 or 16 or 17 or 18 or 19, or 20, or 30-50, 51-80, 81-100 or so
nucleotides in length. Alternatively, in some embodiments, and particularly those involving
preparation of recombinant vectors, transformation of suitable host cells, and preparation of
transgenic plant cells, longer nucleic acid segments are preferred, particularly those that
include the entire coding region of one or more cry genes. As such, the preferred segments
may include those that are up to about 20,000 or so nucleotides in length, or alternatively,
shorter sequences such as those about 19,000, about 18,000, about 17,000, about 16,000,
about 15,000, about 14,000, about 13,000, about 12,000, about 11,000, about 10,000, about 9,000,
about 8,000, about 7,000, about 6,000, about 5,000, about 4,500, about 4,000, about 3,500,
about 3,000, about 2,500, about 2,000, about 1,500, about 1,000, about 500, or about 200 or
so base pairs in length. Of course, these numbers are not intended to be exclusionary of all
possible intermediate lengths in the range of from about 20,000 to about 15 nucleotides, as all
of these intermediate lengths are also contemplated to be useful, and fall within the scope of
the present invention. It will be readily understood that "intermediate lengths", in these
contexts, means any length between the quoted ranges, such as 14, 15, 16, 17, 18, 19, 20,
etc.; 21, 22, 23, 24, 25, 26, 27, 28, 29, etc.; 30, 31, 32, 33, 34, 35, 36 etc.; 40, 41, 42, 43,
44 etc.; 50, 51, 52, 53 etc.; 60, 61, 62, 63 etc.; 70, 80, 90, 100, 110, 120, 130 etc.;
200, 210, 220, 230, 240, 250 etc.; including all integers in the entire range from about 14
to about 10,000, including those integers in the ranges 200-500; 500-1,000; 1,000-2,000;
2,000-3,000; 3,000-5,000 and the like.

In a preferred embodiment, the nucleic acid segments comprise a sequence of
from about 1800 to about 18,000 base pairs in length, and comprise one or more genes which
encode a modified Cry3* polypeptide disclosed herein which has increased activity against
Coleopteran insect pests.

It will also be understood that this invention is not limited to the particular nucleic
acid sequences which encode peptides of the present invention, or which encode the amino
acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, or SEQ
ID NO:108, including the DNA sequences which are particularly disclosed in SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID
NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57,
SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID
NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107. Recombinant vectors and
isolated DNA segments may therefore variously include the peptide-coding regions
themselves, coding regions bearing selected alterations or modifications in the basic coding
region, or they may encode larger polypeptides that nevertheless include these peptide-coding
regions or may encode biologically functional equivalent proteins or peptides that have
variant amino acid sequences.

The DNA segments of the present invention encompass biologically-functional,
equivalent peptides. Such sequences may arise as a consequence of codon redundancy and
functional equivalency that are known to occur naturally within nucleic acid sequences and
the proteins thus encoded. Alternatively, functionally-equivalent proteins or peptides may be
created via the application of recombinant DNA technology, in which changes in the protein
structure may be engineered, based on considerations of the properties of the amino acids
being exchanged. Changes designed by man may be introduced through the application of
site-directed mutagenesis techniques, e.g., to introduce improvements to the antigenicity of
the protein or to test mutants in order to examine activity at the molecular level.
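
By way of illustration only, the codon redundancy referred to above may be demonstrated
with two different DNA sequences that encode the same short peptide. The sketch below
uses a deliberately partial codon table containing a few entries of the standard genetic
code; the table name, the function name, and the two toy sequences are hypothetical and are
not drawn from any SEQ ID NO.

```python
# Illustrative sketch only. CODON_TABLE is a small excerpt of the standard
# genetic code, sufficient for the toy sequences below; a full table would be
# needed for real sequences. All names here are assumptions for this example.

CODON_TABLE = {
    "ATG": "M",                                      # methionine (initiation codon)
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "AAA": "K", "AAG": "K",                          # lysine
    "TTA": "L", "CTG": "L",                          # leucine (two of its six codons)
    "TAA": "*",                                      # stop
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    usable_length = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    for i in range(0, usable_length, 3):
        residue = CODON_TABLE[dna[i:i + 3]]  # KeyError if codon is outside the excerpt
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    seq_one = "ATGGCTAAATTATAA"
    seq_two = "ATGGCGAAGCTGTAA"  # differs from seq_one at three codons
    print(translate(seq_one), translate(seq_two))
    print(seq_one != seq_two and translate(seq_one) == translate(seq_two))
```

Both toy sequences translate to the same four-residue peptide even though they differ at
the nucleotide level, which is the sense in which distinct DNA segments may encode
identical or functionally equivalent peptides.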

If desired, one may also prepare fusion proteins and peptides, e.g., where the
peptide-coding regions are aligned within the same expression unit with other proteins or
peptides having desired functions, such as for purification or immunodetection purposes
(e.g., proteins that may be purified by affinity chromatography and enzyme label coding
regions, respectively).

Recombinant vectors form further aspects of the present invention. Particularly
useful vectors are contemplated to be those vectors in which the coding portion of the DNA
segment, whether encoding a full length protein or smaller peptide, is positioned under the
control of a promoter. The promoter may be in the form of the promoter that is naturally
associated with a gene encoding peptides of the present invention, as may be obtained by
isolating the 5' non-coding sequences located upstream of the coding segment or exon, for
example, using recombinant cloning and/or PCR™ technology, in connection with the
compositions disclosed herein.

2.5 Vectors, Host Cells, and Protein Expression

In other embodiments, it is contemplated that certain advantages will be gained by
positioning the coding DNA segment under the control of a recombinant, or heterologous,
promoter. As used herein, a recombinant or heterologous promoter is intended to refer to a
promoter that is not normally associated with a DNA segment encoding a crystal protein or
peptide in its natural environment. Such promoters may include promoters normally
associated with other genes, and/or promoters isolated from any bacterial, viral, eukaryotic,
or plant cell. Naturally, it will be important to employ a promoter that effectively directs the
expression of the DNA segment in the cell type, organism, or even animal, chosen for
expression. The use of promoter and cell type combinations for protein expression is
generally known to those of skill in the art of molecular biology, for example, see Sambrook
et al., 1989. The promoters employed may be constitutive, or inducible, and can be used
under the appropriate conditions to direct high level expression of the introduced DNA
segment, such as is advantageous in the large-scale production of recombinant proteins or
peptides. Appropriate promoter systems contemplated for use in high-level expression
include, but are not limited to, the Pichia expression vector system (Pharmacia LKB
Biotechnology).

In connection with expression embodiments to prepare recombinant proteins and
peptides, it is contemplated that longer DNA segments will most often be used, with DNA
segments encoding the entire peptide sequence being most preferred. However, it will be
appreciated that the use of shorter DNA segments to direct the expression of crystal peptides
or epitopic core regions, such as may be used to generate anti-crystal protein antibodies, also
falls within the scope of the invention. DNA segments that encode peptide antigens from
about 8, 9, 10, or 11 or so amino acids, and up to and including those of about 30, 40, or 50
or so amino acids in length, or more preferably, from about 8 to about 30 amino acids in
length, or even more preferably, from about 8 to about 20 amino acids in length are
contemplated to be particularly useful. Such peptide epitopes may be amino acid sequences
which comprise contiguous amino acid sequence from SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108.
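
By way of illustration only, the contiguous peptide stretches contemplated above may be
enumerated as sliding windows over a protein sequence. The sketch below defaults to the
"about 8 to about 20" amino acid range recited in the text; the function name, its
parameters, and the toy sequence are hypothetical, and the sketch makes no prediction as to
which windows would in fact be antigenic.

```python
# Illustrative sketch only. Enumerates every contiguous window of the stated
# lengths; it does not predict antigenicity. Names and defaults are assumptions
# for this example (defaults follow the 8-to-20 residue range in the text).

from typing import Iterator


def contiguous_peptides(protein: str, min_len: int = 8, max_len: int = 20) -> Iterator[str]:
    """Yield each contiguous stretch of min_len..max_len residues, shortest lengths first."""
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield protein[start:start + length]


def coding_length_nt(peptide: str) -> int:
    """Nucleotides needed to encode the peptide itself (three per residue, no stop codon)."""
    return 3 * len(peptide)


if __name__ == "__main__":
    toy_protein = "ABCDEFGHIJ"  # arbitrary placeholder letters, not a real protein
    for peptide in contiguous_peptides(toy_protein, min_len=8, max_len=9):
        print(peptide, coding_length_nt(peptide))
```

An 8-residue window corresponds to a 24-nucleotide coding stretch, which is consistent
with the short contiguous nucleic acid lengths discussed in connection with the DNA
segments.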

2.6 Transformed Host Cells and Transgenic Plants 

In one embodiment, the invention provides a transgenic plant having incorporated
into its genome a transgene that encodes a contiguous amino acid sequence selected from the
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, and SEQ
ID NO:108.

A further aspect of the invention is a transgenic plant having incorporated into its
genome a cry3Bb* transgene, provided the transgene comprises a nucleic acid sequence
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ
ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID
NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101,
and SEQ ID NO:107. Also disclosed and claimed are progeny of such a transgenic plant, as
well as its seed, progeny from such seeds, and seeds arising from the second and subsequent
generation plants derived from such a transgenic plant.

The invention also discloses and claims host cells, both native and genetically
engineered, which express the novel cry3Bb* genes to produce Cry3Bb* polypeptides. Pre-
ferred examples of bacterial host cells include B. thuringiensis EG11221, EG11222,
EG11223, EG11224, EG11225, EG11226, EG11227, EG11228, EG11229, EG11230,
EG11231, EG11232, EG11233, EG11234, EG11235, EG11236, EG11237, EG11238,
EG11239, EG11241, EG11242, EG11032, EG11035, EG11036, EG11046, EG11048,
EG11051, EG11057, EG11058, EG11081, EG11082, EG11083, EG11084, EG11095, and
EG11098.

Methods of using such cells to produce Cry3* crystal proteins are also disclosed.
Such methods generally involve culturing the host cell (such as B. thuringiensis EG11221,
EG11222, EG11223, EG11224, EG11225, EG11226, EG11227, EG11228, EG11229,
EG11230, EG11231, EG11232, EG11233, EG11234, EG11235, EG11236, EG11237,
EG11238, EG11239, EG11241, EG11242, EG11032, EG11035, EG11036, EG11046,
EG11048, EG11051, EG11057, EG11058, EG11081, EG11082, EG11083, EG11084,
EG11095, or EG11098) under conditions effective to produce a Cry3* crystal protein, and
obtaining the Cry3* crystal protein from said cell.

In yet another aspect, the present invention provides methods for producing a
transgenic plant which expresses a nucleic acid segment encoding the novel recombinant
crystal proteins of the present invention. The process of producing transgenic plants is well-
known in the art. In general, the method comprises transforming a suitable host cell with one
or more DNA segments which contain one or more promoters operatively linked to a coding
region that encodes one or more of the disclosed B. thuringiensis crystal proteins. Such a
coding region is generally operatively linked to a transcription-terminating region, whereby
the promoter is capable of driving the transcription of the coding region in the cell, and hence
providing the cell the ability to produce the recombinant protein in vivo. Alternatively, in
instances where it is desirable to control, regulate, or decrease the amount of a particular
recombinant crystal protein expressed in a particular transgenic cell, the invention also
provides for the expression of crystal protein antisense mRNA. The use of antisense mRNA
as a means of controlling or decreasing the amount of a given protein of interest in a cell is
well-known in the art.
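
As a minimal illustration of the relationship between a sense transcript and its antisense counterpart, the following sketch computes the reverse complement of an RNA sequence. The function name and the short example sequence are hypothetical and are not taken from the disclosed sequences; the sketch shows only base pairing, not how an antisense construct would be designed or expressed.

```python
# Watson-Crick pairing for RNA: A pairs with U, and C pairs with G.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_mrna: str) -> str:
    """Return the antisense (reverse-complement) RNA, written 5' to 3'."""
    sense = sense_mrna.upper()
    if set(sense) - set("ACGU"):
        raise ValueError("expected an RNA sequence containing only A, C, G and U")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment used only to show the pairing.
print(antisense_rna("AUGGCU"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check on the pairing table.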

Another aspect of the invention comprises a transgenic plant which expresses a gene
or gene segment encoding one or more of the novel polypeptide compositions disclosed
herein. As used herein, the term "transgenic plant" is intended to refer to a plant that has
incorporated DNA sequences, including but not limited to genes which are perhaps not
normally present, DNA sequences not normally transcribed into RNA or translated into a
protein ("expressed"), or any other genes or DNA sequences which one desires to introduce
into the non-transformed plant, such as genes which may normally be present in the non-
transformed plant but which one desires to either genetically engineer or to have altered
expression.

It is contemplated that in some instances the genome of a transgenic plant of the
present invention will have been augmented through the stable introduction of one or more
Cry3Bb*-encoding transgenes, either native, synthetically modified, or mutated. In some
instances, more than one transgene will be incorporated into the genome of the transformed
host plant cell. Such is the case when more than one crystal protein-encoding DNA segment
is incorporated into the genome of such a plant. In certain situations, it may be desirable to
have one, two, three, four, or even more B. thuringiensis crystal proteins (either native or
recombinantly-engineered) incorporated and stably expressed in the transformed transgenic
plant.

A preferred gene which may be introduced includes, for example, a crystal
protein-encoding DNA sequence of bacterial origin, and particularly one or more of
those described herein which are obtained from Bacillus spp. Highly preferred nucleic acid
sequences are those obtained from B. thuringiensis, or any of those sequences which have
been genetically engineered to decrease or increase the insecticidal activity of the crystal
protein in such a transformed host cell.

Means for transforming a plant cell and the preparation of a transgenic cell line
are well-known in the art, and are discussed herein. Vectors, plasmids, cosmids, YACs
(yeast artificial chromosomes) and DNA segments for use in transforming such cells will, of
course, generally comprise either the operons, genes, or gene-derived sequences of the
present invention, either native, or synthetically-derived, and particularly those encoding the
disclosed crystal proteins. These DNA constructs can further include structures such as
promoters, enhancers, polylinkers, or even gene sequences which have positively- or
negatively-regulating activity upon the particular genes of interest as desired. The DNA
segment or gene may encode either a native or modified crystal protein, which will be
expressed in the resultant recombinant cells, and/or which will impart an improved
phenotype to the regenerated plant.

Such transgenic plants may be desirable for increasing the insecticidal resistance
of a monocotyledonous or dicotyledonous plant, by incorporating into such a plant a
transgenic DNA segment encoding a Cry3Bb* crystal protein which is toxic to coleopteran
insects. Particularly preferred plants include grains such as corn, wheat, rye, rice, barley, and
oats; legumes such as soybeans; tubers such as potatoes; fiber crops such as flax and cotton;
turf and pasture grasses; ornamental plants; shrubs; trees; vegetables, berries, citrus, fruits,
cacti, succulents, and other commercially-important crops including garden and houseplants.

In a related aspect, the present invention also encompasses a seed produced by the
transformed plant, a progeny from such seed, and a seed produced by the progeny of the
original transgenic plant, produced in accordance with the above process. Such progeny and
seeds will have one or more crystal protein transgene(s) stably incorporated into their genome,
and such progeny plants will inherit the traits afforded by the introduction of a stable
transgene in Mendelian fashion. All such transgenic plants having incorporated into their
genome transgenic DNA segments encoding one or more Cry3Bb* crystal proteins or
polypeptides are aspects of this invention. Particularly preferred transgenes for the practice
of the invention include nucleic acid segments comprising one or more cry3Bb* gene(s).


2.7 Biological Functional Equivalents 

Modification and changes may be made in the structure of the peptides of the
present invention and DNA segments which encode them and still obtain a functional
molecule that encodes a protein or peptide with desirable characteristics. The following is a
discussion based upon changing the amino acids of a protein to create an equivalent, or even
an improved, second-generation molecule. In particular embodiments of the invention,
mutated crystal proteins are contemplated to be useful for increasing the insecticidal activity
of the protein, and consequently increasing the insecticidal activity and/or expression of the
recombinant transgene in a plant cell. The amino acid changes may be achieved by changing
the codons of the DNA sequence, according to the codons given in Table 4.






Table 4

Amino Acids                    Codons

Alanine          Ala    A      GCA  GCC  GCG  GCU
Cysteine         Cys    C      UGC  UGU
Aspartic Acid    Asp    D      GAC  GAU
Glutamic Acid    Glu    E      GAA  GAG
Phenylalanine    Phe    F      UUC  UUU
Glycine          Gly    G      GGA  GGC  GGG  GGU
Histidine        His    H      CAC  CAU
Isoleucine       Ile    I      AUA  AUC  AUU
Lysine           Lys    K      AAA  AAG
Leucine          Leu    L      UUA  UUG  CUA  CUC  CUG  CUU
Methionine       Met    M      AUG
Asparagine       Asn    N      AAC  AAU
Proline          Pro    P      CCA  CCC  CCG  CCU
Glutamine        Gln    Q      CAA  CAG
Arginine         Arg    R      AGA  AGG  CGA  CGC  CGG  CGU
Serine           Ser    S      AGC  AGU  UCA  UCC  UCG  UCU
Threonine        Thr    T      ACA  ACC  ACG  ACU
Valine           Val    V      GUA  GUC  GUG  GUU
Tryptophan       Trp    W      UGG
Tyrosine         Tyr    Y      UAC  UAU
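
By way of illustration only, the codon assignments of Table 4 can be held in a lookup structure and used to translate a coding sequence or to exchange the codon at a chosen residue position. The following is a minimal sketch under stated assumptions: the function names and the nine-base example sequence are hypothetical, the replacement codon is arbitrarily taken as the first one listed for the new amino acid, and stop codons are not handled because Table 4 does not list them.

```python
# Codons for each amino acid (one-letter code), transcribed from Table 4.
CODONS = {
    "A": ("GCA", "GCC", "GCG", "GCU"),
    "C": ("UGC", "UGU"),
    "D": ("GAC", "GAU"),
    "E": ("GAA", "GAG"),
    "F": ("UUC", "UUU"),
    "G": ("GGA", "GGC", "GGG", "GGU"),
    "H": ("CAC", "CAU"),
    "I": ("AUA", "AUC", "AUU"),
    "K": ("AAA", "AAG"),
    "L": ("UUA", "UUG", "CUA", "CUC", "CUG", "CUU"),
    "M": ("AUG",),
    "N": ("AAC", "AAU"),
    "P": ("CCA", "CCC", "CCG", "CCU"),
    "Q": ("CAA", "CAG"),
    "R": ("AGA", "AGG", "CGA", "CGC", "CGG", "CGU"),
    "S": ("AGC", "AGU", "UCA", "UCC", "UCG", "UCU"),
    "T": ("ACA", "ACC", "ACG", "ACU"),
    "V": ("GUA", "GUC", "GUG", "GUU"),
    "W": ("UGG",),
    "Y": ("UAC", "UAU"),
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(rna: str) -> str:
    """Translate an RNA coding sequence codon by codon.

    Trailing bases that do not fill a codon are ignored; a stop codon
    raises KeyError because Table 4 lists only sense codons.
    """
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, usable, 3))


def substitute_residue(rna: str, position: int, new_aa: str) -> str:
    """Replace the codon at a 1-based residue position with a codon for new_aa."""
    start = 3 * (position - 1)
    return rna[:start] + CODONS[new_aa][0] + rna[start + 3:]


example = "AUGGCUAAA"                       # hypothetical; encodes Met-Ala-Lys
mutant = substitute_residue(example, 2, "V")
print(translate(example), translate(mutant))
```

In practice the choice among synonymous codons would be guided by considerations such as host codon usage, which this sketch does not attempt to model.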



For example, certain amino acids may be substituted for other amino acids in a
protein structure without appreciable loss of interactive binding capacity with structures such
as, for example, antigen-binding regions of antibodies or binding sites on substrate
molecules. Since it is the interactive capacity and nature of a protein that defines that
protein's biological functional activity, certain amino acid sequence substitutions can be
made in a protein sequence, and, of course, its underlying DNA coding sequence, and
nevertheless obtain a protein with like properties. It is thus contemplated by the inventors
that various changes may be made in the peptide sequences of the disclosed compositions, or
corresponding DNA sequences which encode said peptides without appreciable loss of their
biological utility or activity.

In making such changes, the hydropathic index of amino acids may be considered.
The importance of the hydropathic amino acid index in conferring interactive biologic
function on a protein is generally understood in the art (Kyte and Doolittle, 1982, incorporated
herein by reference). It is accepted that the relative hydropathic character of the amino acid
contributes to the secondary structure of the resultant protein, which in turn defines the
interaction of the protein with other molecules, for example, enzymes, substrates, receptors,
DNA, antibodies, antigens, and the like.

Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics (Kyte and Doolittle, 1982). These are: isoleucine
(+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5);
methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan
(-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5);
aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other amino
acids having a similar hydropathic index or score and still result in a protein with similar
biological activity, i.e., still obtain a biologically functionally equivalent protein. In making
such changes, the substitution of amino acids whose hydropathic indices are within ±2 is
preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even
more particularly preferred.
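
The preference tiers just described can be expressed as a simple comparison of hydropathic indices. The sketch below is illustrative only: the function name and the tier labels returned are assumptions, the index values are those recited above, and differences are rounded to one decimal place so that floating-point representation does not affect comparisons at the tier boundaries.

```python
from typing import Optional

# Hydropathic indices as recited above (Kyte and Doolittle, 1982).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_tier(original: str, replacement: str) -> Optional[str]:
    """Classify a substitution by the difference in hydropathic index.

    Returns None when the indices differ by more than 2.
    """
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None


for pair in (("I", "V"), ("I", "L"), ("V", "F"), ("A", "G")):
    print(pair, hydropathy_tier(*pair))
```

The classification considers the hydropathic index alone; other properties of the side chain, such as size or charge, would need to be weighed separately.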

It is also understood in the art that the substitution of like amino acids can be
made effectively on the basis of hydrophilicity. U. S. Patent 4,554,101, specifically
incorporated herein by reference, states that the greatest local average hydrophilicity of a
protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with a
biological property of the protein.

As detailed in U. S. Patent 4,554,101, the following hydrophilicity values have
been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1);
glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0);
threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0);
methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4).

It is understood that an amino acid can be substituted for another having a similar
hydrophilicity value and still obtain a biologically equivalent, and in particular, an
immunologically equivalent protein. In such changes, the substitution of amino acids whose
hydrophilicity values are within ±2 is preferred, those which are within ±1 are particularly
preferred, and those within ±0.5 are even more particularly preferred.
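
The notion of a greatest local average hydrophilicity can be sketched as a sliding-window average over the hydrophilicity values recited above. This is an illustrative sketch only: the window length of six residues, the function name, and the example sequence are assumptions not stated in this specification, and the nominal values are used without the ± 1 ranges given for aspartate, glutamate, and proline.

```python
from typing import Tuple

# Nominal hydrophilicity values as recited above (U. S. Patent 4,554,101).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(protein: str, window: int = 6) -> Tuple[int, str, float]:
    """Return (1-based start, peptide, mean value) of the window with the
    greatest average hydrophilicity; the earliest window wins a tie."""
    if len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_mean = 0, float("-inf")
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        mean = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start + 1, protein[best_start:best_start + window], best_mean


# Hypothetical sequence with a charged stretch flanked by apolar residues.
start, peptide, mean = most_hydrophilic_window("LLLLLLRKDESNAAAA")
print(start, peptide, round(mean, 2))
```

A different window length would shift which stretch is selected, so the window should be treated as a tunable parameter rather than a fixed property of the method.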

As outlined above, amino acid substitutions are generally therefore based on the
relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity,
hydrophilicity, charge, size, and the like. Exemplary substitutions which take various of
the foregoing characteristics into consideration are well known to those of skill in the art and
include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and
asparagine; and valine, leucine and isoleucine.


3.0 Brief Description of the Drawings 

The drawings form part of the present specification and are included to further demonstrate
certain aspects of the present invention. The invention may be better understood by
reference to one or more of these drawings in combination with the detailed description of
specific embodiments presented herein.

FIG. 1. Schematic representation of the monomeric structure of Cry3Bb.

FIG. 2. Stereoscopic view of the monomeric structure of Cry3Bb with associated
water molecules (represented by dots).

FIG. 3A. Schematic representation of domain 1 of Cry3Bb.

FIG. 3B. Diagram of the positions of the 7 helices that comprise domain 1.

FIG. 4. Domain 1 of Cry3Bb is organized into seven α helices illustrated in
FIG. 3A (schematic representation) and FIG. 3B (schematic diagram). The α helices and
amino acid residues are shown.

FIG. 5A. Schematic representation of domain 2 of Cry3Bb.

FIG. 5B. Diagram of the positions of the 11 β strands that compose the 3
β sheets of domain 2.

FIG. 6. Domain 2 of Cry3Bb is a collection of three anti-parallel β sheets illustrated
in FIG. 5. The amino acids that define these sheets are listed below (α8, amino acids
322-328, is also included in domain 2):

FIG. 7A. Schematic representation of domain 3 of Cry3Bb.

FIG. 7B. Diagram of the positions of the β strands that comprise domain 3.

FIG. 8. Domain 3 (FIG. 7) is a loosely organized collection of β strands and
loops; no β sheets are present. The β strands contain the amino acids listed below:

FIG. 9A. A "side" view of the dimeric structure of Cry3Bb. The helical bundles
of domains 1 can be seen in the middle of the molecule.

FIG. 9B. A "top" view of the dimeric structure of Cry3Bb. The helical bundles
of domains 1 can be seen in the middle of the molecule.

FIG. 10. A graphic representation of the growth in conductance with time of
channels formed by Cry3A and Cry3Bb in planar lipid bilayers. Cry3A forms channels with
higher conductances much more rapidly than Cry3Bb.

FIG. 11. A map of pEG1701 which contains the cry3Bb gene with the cry1F
terminator.

FIG. 12. The results of replicated, 1-dose assays against SCRW larvae of
Cry3Bb proteins altered in the lβ2,3 region.

FIG. 13. The results of replicated, 1-dose assays against SCRW larvae of
Cry3Bb proteins altered in the lβ6,7 region.

FIG. 14. The results of replicated, 1-dose screens against SCRW larvae of
Cry3Bb proteins altered in the lβ10,11 region.

FIG. 15. Single channel recordings of channels formed by Cry3Bb.11230 and
WT Cry3Bb in planar lipid bilayers. Cry3Bb.11230 forms channels with well resolved open
and closed states while Cry3Bb rarely does.

FIG. 16. Single channel recordings of channels formed by Cry3Bb and
Cry3Bb.60, a truncated form of Cry3Bb. Cry3Bb.60 forms channels more quickly than
Cry3Bb and, unlike Cry3Bb, produces channels with well resolved open and closed states.

FIG. 17A. Sequence alignment of the amino acid sequence of Cry3A, Cry3B, and
Cry3C.

FIG. 17B. Shown is a continuation of alignment of the amino acid sequence of
Cry3A, Cry3B, and Cry3C shown in FIG. 17A.

FIG. 17C. Shown is a continuation of alignment of the amino acid sequence of
Cry3A, Cry3B, and Cry3C shown in FIG. 17A.

4.0 Description of Illustrative Embodiments

The invention defines new B. thuringiensis (Bt) insecticidal δ-endotoxin proteins
and the biochemical and biophysical strategies used to design the new proteins. Delta-
endotoxins are a class of insecticidal proteins produced by B. thuringiensis that form cation-
selective channels in planar lipid bilayers (English and Slatin, 1992). The new δ-endotoxins
are based on the parent structure of the coleopteran-active δ-endotoxin Cry3Bb. Like other
members of the coleopteran-active class of δ-endotoxins, including Cry3A and Cry3B,
Cry3Bb exhibits excellent insecticidal activity against the Colorado Potato Beetle
(Leptinotarsa decemlineata). However, unlike Cry3A and Cry3B, Cry3Bb is also active
against the southern corn rootworm or SCRW (Diabrotica undecimpunctata howardi Barber)
and the western corn rootworm or WCRW (Diabrotica virgifera virgifera LeConte). The
new insecticidal proteins described herein were specifically designed to improve the biological
activity of the parent Cry3Bb protein. In addition, the design strategies themselves are
novel inventions capable of being applied to and improving B. thuringiensis δ-endotoxins in
general. B. thuringiensis δ-endotoxins are also members of a larger class of bacterial toxins
that form ion channels (see English and Slatin, 1992, for a review). The inventors, therefore,
believe that these design strategies can also be applied to any biologically active, channel-
forming protein to improve its biological properties.

The designed Cry3Bb proteins were engineered using one or more of the following
strategies: (1) identification and alteration of protease-sensitive sites and proteolytic
processing; (2) analysis and manipulation of bound water; (3) manipulation of hydrogen
bonds around mobile regions; (4) loop analysis and loop redesign around flexible helices;
(5) loop design around β strands and β sheets; (6) identification and redesign of complex
electrostatic surfaces; (7) identification and removal of metal binding sites; (8) alteration of
quaternary structure; (9) identification and design of structural residues; and (10) combinations
of any and all sites defined by strategies 1-9. These design strategies permit the
identification and redesign of specific sites on Cry3Bb, ultimately creating new proteins with
improved insecticidal activities. These new proteins are designated Cry3Bb designed proteins
and are named Cry3Bb followed by a period and a suffix (e.g., Cry3Bb.60, Cry3Bb.11231).
The new proteins are listed in Table 2 along with the specific sites on the molecule that were
modified, the amino acid sequence changes at those sites that improve biological activity, the
improved insecticidal activities and the design method used to identify that specific site.

4.1 Some Advantages of the Invention 

Mutagenesis studies with cry genes have failed to identify a significant number of
mutant crystal proteins which have improved broad-spectrum insecticidal activity, that is,
with improved toxicity towards a range of insect pest species. Since agricultural crops are
typically threatened by more than one insect pest species at any given time, desirable mutant
crystal proteins are preferably those that exhibit improvements in toxicity towards multiple
insect pest species. Previous failures to identify such mutants may be attributed to the choice
of sites targeted for mutagenesis. For example, with respect to the related protein, Cry1C,
sites within domain 2 and domain 3 have been the principal targets of mutagenesis efforts,
primarily because these domains are believed to be important for receptor binding and in
determining insecticidal specificity (Aronson et al., 1995; Chen et al., 1993; de Maagd et al.,
1996; Lee et al., 1992; Lee et al., 1995; Lu et al., 1994; Smedley and Ellar, 1996; Smith and
Ellar, 1994; Rajamohan et al., 1995; Rajamohan et al., 1996).

In contrast, the present inventors reasoned that the toxicity of Cry3 proteins, and
specifically the toxicity of the Cry3Bb protein, may be improved against a broader array of
target pests by targeting regions involved in ion channel function rather than regions of the
molecule directly involved in receptor interactions, namely domains 2 and 3. Accordingly,
the inventors opted to target regions within domain 1 of Cry3Bb for mutagenesis for the
purpose of isolating Cry3Bb mutants with improved broad spectrum toxicity. Indeed, in the
present invention, Cry3Bb mutants are described that show improved toxicity towards several
coleopteran pests.

At least one, and probably more than one, α helix of domain 1 is involved in the
formation of ion channels and pores within the insect midgut epithelium (Gazit and Shai,
1993; Gazit and Shai, 1995). Rather than target for mutagenesis the sequences encoding the
α helices of domain 1 as others have (Wu and Aronson, 1992; Aronson et al., 1995; Chen et
al., 1995), the present inventors opted to target exclusively sequences encoding amino acid
residues adjacent to or lying within the predicted loop regions of Cry3Bb that separate these
α helices. Amino acid residues within these loop regions or amino acid residues capping the
end of an α helix and lying adjacent to these loop regions may affect the spatial relationships
among these α helices. Consequently, the substitution of these amino acid residues may
result in subtle changes in tertiary structure, or even quaternary structure, that positively impact
the function of the ion channel. Amino acid residues in the loop regions of domain 1 are
exposed to the solvent and thus are available for various molecular interactions. Altering these
amino acids could result in greater stability of the protein by eliminating or occluding
protease-sensitive sites. Amino acid substitutions that change the surface charge of domain 1
could alter ion channel efficiency or alter interactions with the brush border membrane or
with other portions of the toxin molecule, allowing binding or insertion to be more effective.

According to this invention, base substitutions are made in the underlying cry3Bb
nucleic acid residues in order to change particular codons of the corresponding polypeptides,
and particularly, in those loop regions between α-helices. The insecticidal activity of a crystal
protein ultimately dictates the level of crystal protein required for effective insect control.
The potency of an insecticidal protein should be maximized as much as possible in order to
provide for its economic and efficient utilization in the field. The increased potency of an
insecticidal protein in a bioinsecticide formulation would be expected to improve the field
performance of the bioinsecticide product. Alternatively, increased potency of an insecticidal
protein in a bioinsecticide formulation may promote use of reduced amounts of bioinsecticide
per unit area of treated crop, thereby allowing for more cost-effective use of the
bioinsecticide product. When expressed in planta, the production of crystal proteins with
improved insecticidal activity can be expected to improve plant resistance to susceptible insect
pests.
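
The base substitutions referred to at the start of the preceding paragraph can be illustrated by asking which codon for a desired amino acid differs from an existing codon at the fewest base positions. The sketch below is a hypothetical illustration and not the mutagenesis procedure of the invention: it builds the standard genetic code from the conventional compact UCAG-ordered listing, in which "*" marks a stop codon, and the function name is an assumption.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fewest_base_changes(codon: str, target_aa: str) -> Tuple[str, int]:
    """Return (codon for target_aa, number of bases changed), choosing the
    codon that differs from the input at the fewest positions."""
    if codon not in GENETIC_CODE:
        raise ValueError("codon must be three bases drawn from U, C, A, G")
    candidates = [c for c, aa in GENETIC_CODE.items() if aa == target_aa]
    if not candidates:
        raise ValueError("unknown amino acid code: " + target_aa)

    def distance(other: str) -> int:
        return sum(a != b for a, b in zip(codon, other))

    best = min(candidates, key=distance)
    return best, distance(best)


# Aspartate (GAU) to glutamate needs a single base substitution.
print(fewest_base_changes("GAU", "E"))
```

When several codons tie for the fewest changes, this sketch simply returns the first in enumeration order; a practical design would weigh further factors that are not modeled here.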

4.2 Methods for Culturing B. thuringiensis to Produce Crystal Proteins

The B. thuringiensis strains described herein may be cultured using standard
known media and fermentation techniques. Upon completion of the fermentation cycle, the
bacteria may be harvested by first separating the B. thuringiensis spores and crystals from the
fermentation broth by means well known in the art. The recovered B. thuringiensis spores
and crystals can be formulated into a wettable powder, a liquid concentrate, granules or other
formulations by the addition of surfactants, dispersants, inert carriers and other components
to facilitate handling and application for particular target pests. The formulation and
application procedures are all well known in the art.



4.3 Recombinant Host Cells For Expression of cry* Genes

The nucleotide sequences of the subject invention can be introduced into a wide
variety of microbial hosts. Expression of the toxin gene results, directly or indirectly, in the
intracellular production and maintenance of the pesticide. With suitable hosts, e.g.,
Pseudomonas, the microbes can be applied to the sites of coleopteran insects where they will
proliferate and be ingested by the insects. The result is a control of the unwanted insects.
Alternatively, the microbe hosting the toxin gene can be treated under conditions that prolong the
activity of the toxin produced in the cell. The treated cell then can be applied to the
environment of target pest(s). The resulting product retains the toxicity of the B. thuringiensis
toxin.

Suitable host cells, where the pesticide-containing cells will be treated to prolong
the activity of the toxin in the cell when the then treated cell is applied to the environment of
target pest(s), may include either prokaryotes or eukaryotes, normally being limited to those
cells which do not produce substances toxic to higher organisms, such as mammals. However,
organisms which produce substances toxic to higher organisms could be used, where
the toxin is unstable or the level of application sufficiently low as to avoid any possibility of
toxicity to a mammalian host. As hosts, of particular interest will be the prokaryotes and the
lower eukaryotes, such as fungi. Illustrative prokaryotes, both Gram-negative and Gram-
positive, include Enterobacteriaceae, such as Escherichia, Erwinia, Shigella, Salmonella,
and Proteus; Bacillaceae; Rhizobiaceae, such as Rhizobium; Spirillaceae, such as
Photobacterium, Zymomonas, Serratia, Aeromonas, Vibrio, Desulfovibrio, Spirillum;
Lactobacillaceae; Pseudomonadaceae, such as Pseudomonas and Acetobacter; Azotobacteraceae,
Actinomycetales, and Nitrobacteraceae. Among eukaryotes are fungi, such as Phycomycetes
and Ascomycetes, which include yeast, such as Saccharomyces and Schizosaccharomyces;
and Basidiomycetes yeast, such as Rhodotorula, Aureobasidium, Sporobolomyces, and the
like.

Characteristics of particular interest in selecting a host cell for purposes of
production include ease of introducing the B. thuringiensis gene into the host, availability of
expression systems, efficiency of expression, stability of the pesticide in the host, and the
presence of auxiliary genetic capabilities. Characteristics of interest for use as a pesticide
microcapsule include protective qualities for the pesticide, such as thick cell walls, pigmentation,
and intracellular packaging or formation of inclusion bodies; leaf affinity; lack of mammalian
toxicity; attractiveness to pests for ingestion; ease of killing and fixing without damage to the
toxin; and the like. Other considerations include ease of formulation and handling, economics,
storage stability, and the like.

Host organisms of particular interest include yeast, such as Rhodotorula sp.,
Aureobasidium sp., Saccharomyces sp., and Sporobolomyces sp.; phylloplane organisms
such as Pseudomonas sp., Erwinia sp. and Flavobacterium sp.; or such other organisms as
Escherichia, Lactobacillus sp., Bacillus sp., Streptomyces sp., and the like. Specific
organisms include Pseudomonas aeruginosa, Pseudomonas fluorescens, Saccharomyces
cerevisiae, B. thuringiensis, Escherichia coli, B. subtilis, B. megaterium, B. cereus, Streptomyces
lividans and the like.

Treatment of the microbial cell, e.g., a microbe containing the B. thuringiensis
toxin gene, can be by chemical or physical means, or by a combination of chemical and/or
physical means, so long as the technique does not deleteriously affect the properties of the
toxin, nor diminish the cellular capability in protecting the toxin. Examples of chemical
reagents are halogenating agents, particularly halogens of atomic no. 17-80. More particularly,
iodine can be used under mild conditions and for sufficient time to achieve the desired
results. Other suitable techniques include treatment with aldehydes, such as formaldehyde and
glutaraldehyde; anti-infectives, such as zephiran chloride and cetylpyridinium chloride;
alcohols, such as isopropyl and ethanol; various histologic fixatives, such as Lugol's iodine,
Bouin's fixative, and Helly's fixatives (see e.g., Humason, 1967); or a combination of physical
(heat) and chemical agents that preserve and prolong the activity of the toxin produced in
the cell when the cell is administered to the host animal. Examples of physical means are
short wavelength radiation such as γ-radiation and X-radiation, freezing, UV irradiation,
lyophilization, and the like. The cells employed will usually be intact and be substantially in
the proliferative form when treated, rather than in a spore form, although in some instances
spores may be employed.

Where the B. thuringiensis toxin gene is introduced via a suitable vector into a
microbial host, and said host is applied to the environment in a living state, it is essential that
certain host microbes be used. Microorganism hosts are selected which are known to occupy
the "phytosphere" (phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more
crops of interest. These microorganisms are selected so as to be capable of successfully
competing in the particular environment (crop and other insect habitats) with the wild-type
microorganisms, provide for stable maintenance and expression of the gene expressing the
polypeptide pesticide, and, desirably, provide for improved protection of the pesticide from
environmental degradation and inactivation.

A large number of microorganisms are known to inhabit the phylloplane (the surface
of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide
variety of important crops. These microorganisms include bacteria, algae, and fungi. Of
particular interest are microorganisms, such as bacteria, e.g., genera Bacillus (including the
species and subspecies B. thuringiensis kurstaki HD-1, B. thuringiensis kurstaki HD-73,
B. thuringiensis sotto, B. thuringiensis berliner, B. thuringiensis thuringiensis,
B. thuringiensis tolworthi, B. thuringiensis dendrolimus, B. thuringiensis alesti,
B. thuringiensis galleriae, B. thuringiensis aizawai, B. thuringiensis subtoxicus,
B. thuringiensis entomocidus, B. thuringiensis tenebrionis and B. thuringiensis san diego);
Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium,
Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter,
Azotobacter, Leuconostoc, and Alcaligenes; fungi, particularly yeast, e.g., genera
Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium.
Of particular interest are such phytosphere bacterial species as Pseudomonas syringae,
Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens,
Rhodobacter sphaeroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus,
and Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra,
R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii,
Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus,
Kluyveromyces veronae, and Aureobasidium pullulans.



4.4 Definitions 

In accordance with the present invention, nucleic acid sequences include and are
not limited to DNA (including and not limited to genomic or extragenomic DNA), genes,
RNA (including and not limited to mRNA and tRNA), nucleosides, and suitable nucleic acid
segments either obtained from native sources, chemically synthesized, modified, or otherwise
prepared by the hand of man. The following words and phrases have the meanings set forth
below.

A, an: In accordance with long-standing patent law convention, the words "a"
and "an" when used in this application, including the claims, denote "one or more".

Broad-spectrum: Refers to a wide range of insect species. 

Broad-spectrum activity: The toxicity towards a wide range of insect species.

Expression: The combination of intracellular processes, including transcription
and translation, undergone by a coding DNA molecule such as a structural gene to produce a
polypeptide.

Insecticidal activity: The toxicity towards insects. 

Insecticidal specificity: The toxicity exhibited by a crystal protein or proteins,
microbe or plant, towards multiple insect species.

Intraorder specificity: The toxicity of a particular crystal protein towards insect
species within an Order of insects (e.g., Order Coleoptera).

Interorder specificity: The toxicity of a particular crystal protein towards insect
species of different Orders (e.g., Orders Coleoptera and Diptera).

LC50: The lethal concentration of crystal protein that causes 50% mortality of the
insects treated.

LC95: The lethal concentration of crystal protein that causes 95% mortality of the
insects treated.
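
As an illustration of the LC50 and LC95 definitions, the following is a minimal sketch that
estimates a lethal concentration by linear interpolation of mortality against the logarithm
of concentration. The function name, the interpolation approach, and the bioassay numbers
are assumptions made for illustration only; this specification does not prescribe a
particular calculation, and a probit or logit analysis would more typically be used.

```python
import math


def lethal_concentration(concs, mortalities, target=0.5):
    """Estimate the concentration that gives `target` fractional mortality.

    Hypothetical helper: interpolates linearly in log10(concentration)
    between the two data points that bracket the target mortality.
    """
    points = sorted(zip(concs, mortalities))
    for (c_lo, m_lo), (c_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= target <= m_hi and m_hi > m_lo:
            frac = (target - m_lo) / (m_hi - m_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target mortality is not bracketed by the data")


# Made-up dose-response data: concentration (arbitrary units) vs. fraction killed.
concs = [1.0, 10.0, 100.0, 1000.0]
mortalities = [0.10, 0.30, 0.70, 0.98]

lc50 = lethal_concentration(concs, mortalities, target=0.50)
lc95 = lethal_concentration(concs, mortalities, target=0.95)
```

A lower LC50 for a modified protein than for the unmodified protein, measured in the same
assay, would indicate greater toxicity towards the insects treated.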

Promoter: A recognition site on a DNA sequence or group of DNA sequences
that provide an expression control element for a structural gene and to which RNA
polymerase specifically binds and initiates RNA synthesis (transcription) of that gene.

Regeneration: The process of growing a plant from a plant cell (e.g., plant
protoplast or explant).

Structural gene: A gene that is expressed to produce a polypeptide.

Transformation: A process of introducing an exogenous DNA sequence (e.g., a
vector, a recombinant DNA molecule) into a cell or protoplast in which that exogenous DNA
is incorporated into a chromosome or is capable of autonomous replication.

Transformed cell: A cell whose DNA has been altered by the introduction of an
exogenous DNA molecule into that cell.

Transgenic cell: Any cell derived or regenerated from a transformed cell or
derived from a transgenic cell. Exemplary transgenic cells include plant calli derived from a
transformed plant cell and particular cells such as leaf, root, stem, e.g., somatic cells, or
reproductive (germ) cells obtained from a transgenic plant.

Transgenic plant: A plant or progeny thereof derived from a transformed plant 
cell or protoplast, wherein the plant DNA contains an introduced exogenous DNA molecule 
not originally present in a native, non-transgenic plant of the same strain. The terms 
"transgenic plant" and "transformed plant" have sometimes been used in the art as 
synonymous terms to define a plant whose DNA contains an exogenous DNA molecule.
However, it is thought more scientifically correct to refer to a regenerated plant or callus 
obtained from a transformed plant cell or protoplast as being a transgenic plant, and that 
usage will be followed herein. 

Vector: A DNA molecule capable of replication in a host cell and/or to which 
another DNA segment can be operatively linked so as to bring about replication of the
attached segment. A plasmid is an exemplary vector. 

As used herein, the designations "CryIII" and "Cry3" are synonymous, as
are the designations "CryIIIB2" and "Cry3Bb." Likewise, the inventors have utilized the
generic term Cry3Bb* to denote any and all Cry3Bb variants which comprise amino acid
sequences modified in the protein. Similarly, cry3Bb* is meant to denote any and all nucleic
acid segments and/or genes which encode a Cry3Bb* protein, etc.

4.5 Preparation of cry3* Polynucleotides 

Once the structure of the desired peptide to be mutagenized has been analyzed
using one or more of the design strategies disclosed herein, it will be desirable to introduce
one or more mutations into either the protein or, alternatively, into the DNA sequence
encoding the protein for the purpose of producing a mutated protein with altered bioinsecticidal
properties.

To that end, the present invention encompasses both site-specific mutagenesis
methods and random mutagenesis of a nucleic acid segment encoding a crystal protein in the
manner described herein. In particular, methods are disclosed for the mutagenesis of nucleic
acid segments encoding the amino acid sequences using one or more of the design strategies
described herein. Using the assay methods described herein, one may then identify mutants
arising from these procedures which have improved insecticidal properties or altered
specificity, either intraorder or interorder.

The means for mutagenizing a DNA segment encoding a crystal protein are
well-known to those of skill in the art. Modifications may be made by random or site-specific
mutagenesis procedures. The nucleic acid may be modified by altering its structure through
the addition or deletion of one or more nucleotides from the sequence.

Mutagenesis may be performed in accordance with any of the techniques known
in the art such as and not limited to synthesizing an oligonucleotide having one or more
mutations within the sequence of a particular crystal protein. A "suitable host" is any host
which will express Cry3Bb, such as and not limited to B. thuringiensis and E. coli.
Screening for insecticidal activity, in the case of Cry3Bb, includes and is not limited to
coleopteran-toxic activity, which may be screened for by techniques known in the art.

In particular, site-specific mutagenesis is a technique useful in the preparation of
individual peptides, or biologically functional equivalent proteins or peptides, through
specific mutagenesis of the underlying DNA. The technique further provides a ready ability
to prepare and test sequence variants, for example, incorporating one or more of the
foregoing considerations, by introducing one or more nucleotide sequence changes into the
DNA. Site-specific mutagenesis allows the production of mutants through the use of specific
oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well
as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size
and sequence complexity to form a stable duplex on both sides of the deletion junction being
traversed. Typically, a primer of about 17 to about 75 nucleotides or more in length is
preferred, with about 10 to about 25 or more residues on both sides of the junction of the
sequence being altered.
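
The primer geometry described above can be sketched as follows. This is a minimal
illustration assuming a single-base substitution and an arbitrary, made-up template; the
function name, its parameters, and the default flank length are hypothetical and are not
taken from this specification.

```python
def design_mutagenic_primer(template, position, new_base, flank=12):
    """Return a primer carrying a single-base substitution.

    `template` is the strand to be copied into the primer, `position` is the
    0-based index of the base to change, and `flank` is the number of unaltered
    bases kept on each side of the change (all names are illustrative).
    """
    template = template.upper()
    new_base = new_base.upper()
    if not 10 <= flank <= 25:
        raise ValueError("flank should be about 10 to 25 nucleotides")
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("not enough flanking sequence around the mutation site")
    if template[position] == new_base:
        raise ValueError("new base is identical to the original base")
    left = template[position - flank:position]
    right = template[position + 1:position + 1 + flank]
    return left + new_base + right


# Arbitrary 40-nucleotide template used only to exercise the function.
template = "ACGT" * 10
primer = design_mutagenic_primer(template, position=20, new_base="G")
```

With a flank of 12 the primer is 25 nucleotides long, which falls within the range of about
17 to about 75 nucleotides given above.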

In general, the technique of site-specific mutagenesis is well known in the art, as
exemplified by various publications. As will be appreciated, the technique typically employs
a phage vector which exists in both a single stranded and double stranded form. Typical
vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These
phage are readily commercially available and their use is generally well known to those
skilled in the art. Double stranded plasmids are also routinely employed in site-directed
mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a
phage.

In general, site-directed mutagenesis in accordance herewith is performed by first
obtaining a single-stranded vector or melting apart the two strands of a double stranded vector
which includes within its sequence a DNA sequence which encodes the desired peptide. An
oligonucleotide primer bearing the desired mutated sequence is prepared, generally
synthetically. This primer is then annealed with the single-stranded vector, and subjected to DNA
polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete
the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one
strand encodes the original non-mutated sequence and the second strand bears the desired
mutation. This heteroduplex vector is then used to transform or transfect appropriate cells,
such as E. coli cells, and clones are selected which include recombinant vectors bearing the
mutated sequence arrangement. A genetic selection scheme was devised by Kunkel et al.
(1987) to enrich for clones incorporating the mutagenic oligonucleotide. Alternatively, the
use of PCR™ with commercially available thermostable enzymes such as Taq polymerase
may be used to incorporate a mutagenic oligonucleotide primer into an amplified DNA
fragment that can then be cloned into an appropriate cloning or expression vector. The
PCR™-mediated mutagenesis procedures of Tomic et al. (1990) and Upender et al. (1995) provide
two examples of such protocols. A PCR™ employing a thermostable ligase in addition to a
thermostable polymerase may also be used to incorporate a phosphorylated mutagenic
oligonucleotide into an amplified DNA fragment that may then be cloned into an appropriate
cloning or expression vector. The mutagenesis procedure described by Michael (1994)
provides an example of one such protocol.

The preparation of sequence variants of the selected peptide-encoding DNA
segments using site-directed mutagenesis is provided as a means of producing potentially useful
species and is not meant to be limiting as there are other ways in which sequence variants of
peptides and the DNA sequences encoding them may be obtained. For example, recombinant
vectors encoding the desired peptide sequence may be treated with mutagenic agents, such as
hydroxylamine, to obtain sequence variants.

As used herein, the term "oligonucleotide directed mutagenesis procedure" refers
to template-dependent processes and vector-mediated propagation which result in an increase
in the concentration of a specific nucleic acid molecule relative to its initial concentration, or
in an increase in the concentration of a detectable signal, such as amplification. As used
herein, the term "oligonucleotide directed mutagenesis procedure" is intended to refer to a
process that involves the template-dependent extension of a primer molecule. The term
"template dependent process" refers to nucleic acid synthesis of an RNA or a DNA molecule
wherein the sequence of the newly synthesized strand of nucleic acid is dictated by the
well-known rules of complementary base pairing (see, for example, Watson, 1987).
Typically, vector mediated methodologies involve the introduction of the nucleic acid fragment
into a DNA or RNA vector, the clonal amplification of the vector, and the recovery of the
amplified nucleic acid fragment. Examples of such methodologies are provided by U. S.
Patent 4,237,224, specifically incorporated herein by reference in its entirety.

A number of template dependent processes are available to amplify the target
sequences of interest present in a sample. One of the best known amplification methods is the
polymerase chain reaction (PCR™) which is described in detail in U. S. Patents 4,683,195,
4,683,202 and 4,800,159 (each of which is specifically incorporated herein by reference in its
entirety). Briefly, in PCR™, two primer sequences are prepared which are complementary to
regions on opposite complementary strands of the target sequence. An excess of
deoxynucleoside triphosphates is added to a reaction mixture along with a DNA polymerase (e.g.,
Taq polymerase). If the target sequence is present in a sample, the primers will bind to the
target and the polymerase will cause the primers to be extended along the target sequence by
adding on nucleotides. By raising and lowering the temperature of the reaction mixture, the
extended primers will dissociate from the target to form reaction products, excess primers
will bind to the target and to the reaction products, and the process is repeated. Preferably a
reverse transcriptase PCR™ amplification procedure may be performed in order to quantify
the amount of mRNA amplified. Polymerase chain reaction methodologies are well known
in the art.
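
Because each temperature cycle can at most double the number of target molecules, the
yield of the cycling described above can be approximated with simple arithmetic. The
following is a hedged sketch only: the function name and the per-cycle efficiency parameter
are assumptions introduced for illustration, and real reactions plateau as primers and
nucleotides are depleted.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of PCR.

    `efficiency` is the assumed fraction of templates copied in each cycle
    (1.0 means perfect doubling); it is an illustrative parameter.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


ideal = pcr_copies(1, 30)            # perfect doubling for 30 cycles
realistic = pcr_copies(1, 30, 0.9)   # assumed 90% efficiency per cycle
```

Under perfect doubling, a single template molecule yields 2^30 (roughly one billion) copies
after 30 cycles; any efficiency below 1.0 gives fewer.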

Another method for amplification is the ligase chain reaction (referred to as LCR),
disclosed in Eur. Pat. Appl. Publ. No. 320,308, incorporated herein by reference in its
entirety. In LCR, two complementary probe pairs are prepared, and in the presence of the target
sequence, each pair will bind to opposite complementary strands of the target such that they
abut. In the presence of a ligase, the two probe pairs will link to form a single unit. By
temperature cycling, as in PCR™, bound ligated units dissociate from the target and then serve as
"target sequences" for ligation of excess probe pairs. U. S. Patent 4,883,750, specifically
incorporated herein by reference in its entirety, describes an alternative method of amplification
similar to LCR for binding probe pairs to a target sequence.
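
The requirement that the probes abut on the target before ligation can be illustrated with a
short sketch. It models only one strand of the target and one probe pair, whereas LCR as
described uses pairs against both strands; the function names and the example sequence
are hypothetical and chosen purely for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ligate_if_abutting(target, probe_a, probe_b):
    """Join two probes only if they anneal side by side on `target`.

    All sequences are written 5'->3'. probe_a must anneal immediately
    downstream (3' side on the target) of probe_b, so that the 3' end of
    probe_a meets the 5' end of probe_b. Returns the ligated product, or
    None if either probe fails to anneal or a gap or overlap separates them.
    """
    target = target.upper()
    site_a = reverse_complement(probe_a)  # target region bound by probe_a
    site_b = reverse_complement(probe_b)  # target region bound by probe_b
    start_a = target.find(site_a)
    start_b = target.find(site_b)
    if start_a < 0 or start_b < 0:
        return None
    if start_b + len(site_b) != start_a:
        return None
    return probe_a.upper() + probe_b.upper()


# Made-up target strand; the probes are derived from two adjacent 8-base regions.
target = "GGATCCGATTACAGGCTTAA"
probe_b = reverse_complement(target[2:10])
probe_a = reverse_complement(target[10:18])
product = ligate_if_abutting(target, probe_a, probe_b)
```

The ligated product is complementary to the whole region spanned by the two probes, which
is why it can serve as a new target for the complementary probe pair in the next cycle.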

Qbeta Replicase™, described in Intl. Pat. Appl. Publ. No. PCT/US87/00880,
incorporated herein by reference in its entirety, may also be used as still another amplification
method in the present invention. In this method, a replicative sequence of RNA which has a
region complementary to that of a target is added to a sample in the presence of an RNA
polymerase. The polymerase will copy the replicative sequence which can then be detected.

An isothermal amplification method, in which restriction endonucleases and
ligases are used to achieve the amplification of target molecules that contain nucleotide
5'-[α-thio]triphosphates in one strand of a restriction site (Walker et al., 1992, incorporated
herein by reference in its entirety), may also be useful in the amplification of nucleic acids in
the present invention.






Strand Displacement Amplification (SDA) is another method of carrying out
isothermal amplification of nucleic acids which involves multiple rounds of strand displacement
and synthesis, i.e., nick translation. A similar method, called Repair Chain Reaction (RCR),
is another method of amplification which may be useful in the present invention and
involves annealing several probes throughout a region targeted for amplification, followed by a
repair reaction in which only two of the four bases are present. The other two bases can be
added as biotinylated derivatives for easy detection. A similar approach is used in SDA.

Sequences can also be detected using a cyclic probe reaction (CPR). In CPR, a
probe having 3' and 5' end sequences of non-Cry-specific DNA and an internal sequence of a
Cry-specific RNA is hybridized to DNA which is present in a sample. Upon hybridization,
the reaction is treated with RNase H, and the products of the probe are identified as distinctive
products which are released after digestion and generate a signal. The original template is
annealed to another cycling probe and the reaction is repeated. Thus, CPR involves amplifying
a signal generated by hybridization of a probe to a cry-specific expressed nucleic acid.

Still other amplification methods described in Great Britain Pat. Appl. No. 2 202
328, and in Intl. Pat. Appl. Publ. No. PCT/US89/01025, each of which is incorporated herein
by reference in its entirety, may be used in accordance with the present invention. In the
former application, "modified" primers are used in a PCR™-like, template- and
enzyme-dependent synthesis. The primers may be modified by labeling with a capture moiety (e.g.,
biotin) and/or a detector moiety (e.g., enzyme). In the latter application, an excess of labeled
probes is added to a sample. In the presence of the target sequence, the probe binds and is
cleaved catalytically. After cleavage, the target sequence is released intact to be bound by
excess probe. Cleavage of the labeled probe signals the presence of the target sequence.

Other nucleic acid amplification procedures include transcription-based
amplification systems (TAS) (Kwoh et al., 1989; Intl. Pat. Appl. Publ. No. WO 88/10315,
incorporated herein by reference in its entirety), including nucleic acid sequence based amplification
(NASBA) and 3SR. In NASBA, the nucleic acids can be prepared for amplification by
standard phenol/chloroform extraction, heat denaturation of a sample, treatment with lysis buffer
and minispin columns for isolation of DNA and RNA or guanidinium chloride extraction of
RNA. These amplification techniques involve annealing a primer which has crystal
protein-specific sequences. Following polymerization, DNA/RNA hybrids are digested with RNase
H while double stranded DNA molecules are heat denatured again. In either case the single
stranded DNA is made fully double stranded by addition of a second crystal protein-specific
primer, followed by polymerization. The double stranded DNA molecules are then multiply
transcribed by a polymerase such as T7 or SP6. In an isothermal cyclic reaction, the RNAs
are reverse transcribed into double stranded DNA, and transcribed once again with a
polymerase such as T7 or SP6. The resulting products, whether truncated or complete, indicate
crystal protein-specific sequences.

Eur. Pat. Appl. Publ. No. 329,822, incorporated herein by reference in its entirety,
discloses a nucleic acid amplification process involving cyclically synthesizing
single-stranded RNA ("ssRNA"), ssDNA, and double-stranded DNA ("dsDNA"), which may be
used in accordance with the present invention. The ssRNA is a first template for a first
primer oligonucleotide, which is elongated by reverse transcriptase (RNA-dependent DNA
polymerase). The RNA is then removed from the resulting DNA:RNA duplex by the action of
ribonuclease H (RNase H, an RNase specific for RNA in a duplex with either DNA or RNA).
The resultant ssDNA is a second template for a second primer, which also includes the
sequences of an RNA polymerase promoter (exemplified by T7 RNA polymerase) 5' to its
homology to its template. This primer is then extended by DNA polymerase (exemplified by
the large "Klenow" fragment of E. coli DNA polymerase I), resulting in a double-stranded
DNA molecule, having a sequence identical to that of the original RNA between
the primers and having additionally, at one end, a promoter sequence. This promoter
sequence can be used by the appropriate RNA polymerase to make many RNA copies of the
DNA. These copies can then re-enter the cycle leading to very swift amplification. With
proper choice of enzymes, this amplification can be done isothermally without addition of
enzymes at each cycle. Because of the cyclical nature of this process, the starting sequence
can be chosen to be in the form of either DNA or RNA.

Intl. Pat. Appl. Publ. No. WO 89/06700, incorporated herein by reference in its
entirety, discloses a nucleic acid sequence amplification scheme based on the hybridization of
a promoter/primer sequence to a target single-stranded DNA ("ssDNA") followed by
transcription of many RNA copies of the sequence. This scheme is not cyclic; i.e., new templates
are not produced from the resultant RNA transcripts. Other amplification methods include
"RACE" (Frohman, 1990), and "one-sided PCR™" (Ohara, 1989) which are well-known to
those of skill in the art.

Methods based on ligation of two (or more) oligonucleotides in the presence of
nucleic acid having the sequence of the resulting "di-oligonucleotide", thereby amplifying the
di-oligonucleotide (Wu and Dean, 1996, incorporated herein by reference in its entirety), may
also be used in the amplification of DNA sequences of the present invention.

4.6 Phage-Resistant Variants 

In certain embodiments, one may desire to prepare one or more phage-resistant
variants of the B. thuringiensis mutants prepared by the methods described herein. To do so,
an aliquot of a phage lysate is spread onto nutrient agar and allowed to dry. An aliquot of the
phage-sensitive bacterial strain is then plated directly over the dried lysate and allowed to dry.
The plates are incubated at 30°C for 2 days, at which time numerous colonies can be seen
growing on the agar. Some of these colonies are picked and
subcultured onto nutrient agar plates. These apparent resistant cultures are tested for
resistance by cross streaking with the phage lysate. A line of the phage lysate is streaked on the
plate and allowed to dry. The presumptive resistant cultures are then streaked across the
phage line. Resistant bacterial cultures show no lysis anywhere in the streak across the phage
line after overnight incubation at 30°C. The resistance to phage is then reconfirmed by
plating a lawn of the resistant culture onto a nutrient agar plate. The sensitive strain is also
plated in the same manner to serve as the positive control. After drying, a drop of the phage
lysate is plated in the center of the plate and allowed to dry. Resistant cultures show no
lysis in the area where the phage lysate has been placed after incubation at 30°C for 24 hours.


4.7 Crystal Protein Compositions As Insecticides and Methods of Use

Order Coleoptera comprises numerous beetle species including ground beetles,
reticulated beetles, skin and larder beetles, long-horned beetles, leaf beetles, weevils, bark
beetles, ladybird beetles, soldier beetles, stag beetles, water scavenger beetles, and a host of
other beetles. A brief taxonomy of the Order is given at the website
http://www.ncbi.nlm.nih.gov/Taxonomy/tax.html.

Particularly important among the Coleoptera are the agricultural pests included
within the infraorders Chrysomeliformia and Cucujiformia. Members of the infraorder
Chrysomeliformia, including the leaf beetles (Chrysomelidae) and the weevils (Curculionidae),
are particularly problematic to agriculture, and are responsible for a variety of insect damage
to crops and plants. The infraorder Cucujiformia includes the families Coccinellidae,
Cucujidae, Lagriidae, Meloidae, Rhipiphoridae, and Tenebrionidae. Within this infraorder,
members of the family Chrysomelidae (which includes the genera Exema, Chrysomela, Oreina,
Chrysolina, Leptinotarsa, Gonioctena, Oulema, Monoxia, Ophraella, Cerotoma, Diabrotica,
and Lachnaia), are well-known for their potential to destroy agricultural crops.

As the toxins of the present invention have been shown to be effective in
combatting a variety of members of the order Coleoptera, the inventors contemplate that the insects
of many Coleopteran genera may be controlled or eradicated using the polypeptide
compositions described herein. Likewise, the methods described herein for generating modified
polypeptides having enhanced insect specificity may also be useful in extending the range of
the insecticidal activity of the modified polypeptides to other insect species within, and
outside of, the Order Coleoptera.

As such, the inventors contemplate that the crystal protein compositions disclosed
herein will find particular utility as insecticides for topical and/or systemic application to
field crops, including but not limited to rice, wheat, alfalfa, corn (maize), soybeans, tobacco,
potato, barley, canola (rapeseed), sugarbeet, sugarcane, flax, rye, oats, cotton, sunflower;
grasses, such as pasture and turf grasses; fruits, citrus, nuts, trees, shrubs and vegetables; as
well as ornamental plants, cacti, succulents, and the like.

Disclosed and claimed is a composition comprising an insecticidally-effective
amount of a Cry3Bb* crystal protein composition. The composition preferably comprises the
amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, or SEQ ID NO:108 or
biologically-functional equivalents thereof.

The insecticide composition may also comprise a Cry3Bb* crystal protein that is
encoded by a nucleic acid sequence having the sequence of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15,
SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID
NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37,
SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID
NO:99, or SEQ ID NO:108, or, alternatively, a nucleic acid sequence which hybridizes to the
nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31,
SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID
NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID
NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, or SEQ ID NO:107 under conditions
of moderate stringency.

The insecticidal compositions may comprise one or more B. thuringiensis cell
types, or one or more cultures of such cells, or, alternatively, a mixture of one or more B.
thuringiensis cells which express one or more of the novel crystal proteins of the invention in
combination with another insecticidal composition. In certain aspects it may be desirable to
prepare compositions which contain a plurality of crystal proteins, either native or modified,
for treatment of one or more types of susceptible insects. The B. thuringiensis cells of the
invention can be treated prior to formulation to prolong the insecticidal activity when the cells
are applied to the environment of the target insect(s). Such treatment can be by chemical or
physical means, or by a combination of chemical and/or physical means, so long as the
technique does not deleteriously affect the properties of the insecticide, nor diminish the cellular
capability in protecting the insecticide. Examples of chemical reagents are halogenating
agents, particularly halogens of atomic no. 17-80. More particularly, iodine can be used
under mild conditions and for sufficient time to achieve the desired results. Other suitable
techniques include treatment with aldehydes, such as formaldehyde and glutaraldehyde;
anti-infectives, such as zephiran chloride; alcohols, such as isopropanol and ethanol; various
histologic fixatives, such as Bouin's fixative and Helly's fixative (see Humason, 1967); or a
combination of physical (heat) and chemical agents that prolong the activity of the
δ-endotoxin produced in the cell when the cell is applied to the environment of the target
pest(s). Examples of physical means are short wavelength radiation such as gamma-radiation
and X-radiation, freezing, UV irradiation, lyophilization, and the like.

The inventors contemplate that any formulation methods known to those of skill
in the art may be employed using the proteins disclosed herein to prepare such bioinsecticide
compositions. It may be desirable to formulate whole cell preparations, cell extracts, cell
suspensions, cell homogenates, cell lysates, cell supernatants, cell filtrates, or cell pellets of a
cell culture (preferably a bacterial cell culture such as a B. thuringiensis cell culture described
in Table 3) that expresses one or more cry3Bb* DNA segments to produce the encoded
Cry3Bb* protein(s) or peptide(s). The methods for preparing such formulations are known to
those of skill in the art, and may include, e.g., desiccation, lyophilization, homogenization,
extraction, filtration, centrifugation, sedimentation, or concentration of one or more cultures
of bacterial cells, such as B. thuringiensis cells described in Table 3, which express the
Cry3Bb* peptide(s) of interest.

In one preferred embodiment, the bioinsecticide composition comprises an oil
flowable suspension comprising lysed or unlysed bacterial cells, spores, or crystals which
contain one or more of the novel crystal proteins disclosed herein. Preferably the cells are B.
thuringiensis cells; however, any such bacterial host cell expressing the novel nucleic acid
segments disclosed herein and producing a crystal protein is contemplated to be useful, such
as Bacillus spp., including B. megaterium, B. subtilis, and B. cereus; Escherichia spp., including
E. coli; and/or Pseudomonas spp., including P. cepacia, P. aeruginosa, and P. fluorescens.
Alternatively, the oil flowable suspension may consist of a combination of one or more of the
following compositions: lysed or unlysed bacterial cells, spores, crystals, and/or purified
crystal proteins.

In a second preferred embodiment, the bioinsecticide composition comprises a 
water dispersible granule or powder. This granule or powder may comprise lysed or unlysed
bacterial cells, spores, or crystals which contain one or more of the novel crystal proteins
disclosed herein. Preferred sources for these compositions include bacterial cells such as B.
thuringiensis cells; however, bacteria of the genera Bacillus, Escherichia, and Pseudomonas
which have been transformed with a DNA segment disclosed herein and which express the
crystal protein are also contemplated to be useful. Alternatively, the granule or powder may
consist of a combination of one or more of the following compositions: lysed or unlysed
bacterial cells, spores, crystals, and/or purified crystal proteins.

In a third important embodiment, the bioinsecticide composition comprises a 
wettable powder, spray, emulsion, colloid, aqueous or organic solution, dust, pellet, or
colloidal concentrate. Such a composition may contain either unlysed or lysed bacterial cells,
spores, crystals, or cell extracts as described above, which contain one or more of the novel
crystal proteins disclosed herein. Preferred bacterial cells are B. thuringiensis cells; however,
bacteria such as B. megaterium, B. subtilis, B. cereus, E. coli, or Pseudomonas spp. cells
transformed with a DNA segment disclosed herein and expressing the crystal protein are also
contemplated to be useful. Such dry forms of the insecticidal compositions may be
formulated to dissolve immediately upon wetting, or alternatively, dissolve in a controlled-release,
sustained-release, or other time-dependent manner. Alternatively, such a composition may
consist of a combination of one or more of the following compositions: lysed or unlysed
bacterial cells, spores, crystals, and/or purified crystal proteins.

In a fourth important embodiment, the bioinsecticide composition comprises an
aqueous solution or suspension or cell culture of lysed or unlysed bacterial cells, spores,
crystals, or a mixture of lysed or unlysed bacterial cells, spores, and/or crystals, such as those
described above which contain one or more of the novel crystal proteins disclosed herein.
Such aqueous solutions or suspensions may be provided as a concentrated stock solution
which is diluted prior to application, or alternatively, as a diluted solution ready-to-apply.

For these methods involving application of bacterial cells, the cellular host
containing the crystal protein gene(s) may be grown in any convenient nutrient medium, where
the DNA construct provides a selective advantage, providing for a selective medium so that
substantially all or all of the cells retain the B. thuringiensis gene. These cells may then be
harvested in accordance with conventional ways. Alternatively, the cells can be treated prior
to harvesting.

When the insecticidal compositions comprise B. thuringiensis cells, spores, and/or
crystals containing the modified crystal protein(s) of interest, such compositions may be
formulated in a variety of ways. They may be employed as wettable powders, granules or dusts,
by mixing with various inert materials, such as inorganic minerals (phyllosilicates,
carbonates, sulfates, phosphates, and the like) or botanical materials (powdered corncobs, rice hulls,
walnut shells, and the like). The formulations may include spreader-sticker adjuvants,
stabilizing agents, other pesticidal additives, or surfactants. Liquid formulations may be
aqueous-based or non-aqueous and employed as foams, suspensions, emulsifiable concentrates, or the
like. The ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or
polymers.

Alternatively, the novel Cry3Bb-derived mutated crystal proteins may be prepared 
by native or recombinant bacterial expression systems in vitro and isolated for subsequent 
field application. Such protein may be either in crude cell lysates, suspensions, colloids, etc.,
or alternatively may be purified, refined, buffered, and/or further processed, before
formulating in an active biocidal formulation. Likewise, under certain circumstances, it may be
desirable to isolate crystals and/or spores from bacterial cultures expressing the crystal protein and
apply solutions, suspensions, or colloidal preparations of such crystals and/or spores as the
active bioinsecticidal composition.

Another important aspect of the invention is a method of controlling coleopteran
insects which are susceptible to the novel compositions disclosed herein. Such a method
generally comprises contacting the insect or insect population, colony, etc., with an
insecticidally-effective amount of a Cry3Bb* crystal protein composition. The method may utilize
Cry3Bb* crystal proteins such as those disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, or
SEQ ID NO:108, or biologically functional equivalents thereof.

Alternatively, the method may utilize one or more Cry3Bb* crystal proteins
which are encoded by the nucleic acid sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID
NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99,
SEQ ID NO:101, or SEQ ID NO:107, or by one or more nucleic acid sequences which
hybridize to the sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID
NO:107, under conditions of moderate, or higher, stringency. The methods for identifying
sequences which hybridize to those disclosed under conditions of moderate or higher
stringency are well-known to those of skill in the art, and are discussed herein.

Regardless of the method of application, the active component(s)
are applied at an insecticidally-effective amount, which will vary depending on such factors
as, for example, the specific coleopteran insects to be controlled, the specific plant or crop to
be treated, the environmental conditions, and the method, rate, and quantity of application of
the insecticidally-active composition.

The insecticide compositions described may be made by formulating either the
bacterial cell, crystal and/or spore suspension, or isolated protein component with the desired
agriculturally-acceptable carrier. The compositions may be formulated prior to
administration in an appropriate means such as lyophilized, freeze-dried, desiccated, or in an aqueous
carrier, medium or suitable diluent, such as saline or other buffer. The formulated
compositions may be in the form of a dust or granular material, or a suspension in oil (vegetable or
mineral), or water or oil/water emulsions, or as a wettable powder, or in combination with
any other carrier material suitable for agricultural application. Suitable agricultural carriers
can be solid or liquid and are well known in the art. The term "agriculturally-acceptable
carrier" covers all adjuvants, e.g., inert components, dispersants, surfactants, tackifiers, binders,
etc., that are ordinarily used in insecticide formulation technology; these are well known to
those skilled in insecticide formulation. The formulations may be mixed with one or more
solid or liquid adjuvants and prepared by various means, e.g., by homogeneously mixing,
blending and/or grinding the insecticidal composition with suitable adjuvants using
conventional formulation techniques.

The insecticidal compositions of this invention are applied to the environment of 
the target coleopteran insect, typically onto the foliage of the plant or crop to be protected, by 
conventional methods, preferably by spraying. The strength and duration of insecticidal
application will be set with regard to conditions specific to the particular pest(s), crop(s) to be
treated and particular environmental conditions. The proportional ratio of active ingredient
to carrier will naturally depend on the chemical nature, solubility, and stability of the
insecticidal composition, as well as the particular formulation contemplated.

Other application techniques, e.g., dusting, sprinkling, soaking, soil injection, soil 
tilling, seed coating, seedling coating, spraying, aerating, misting, atomizing, and the like, are
also feasible and may be required under certain circumstances such as, e.g., for insects that cause
root or stalk infestation, or for application to delicate vegetation or ornamental plants. These
application procedures are also well-known to those of skill in the art.

The insecticidal composition of the invention may be employed in the method of 
the invention singly or in combination with other compounds, including, but not limited to,
other pesticides. The method of the invention may also be used in conjunction with other
treatments such as surfactants, detergents, polymers or time-release formulations. The
insecticidal compositions of the present invention may be formulated for either systemic or topical
use.

The concentration of insecticidal composition which is used for environmental,
systemic, or foliar application will vary widely depending upon the nature of the particular
formulation, means of application, environmental conditions, and degree of biocidal activity.
Typically, the bioinsecticidal composition will be present in the applied formulation at a
concentration of at least about 1% by weight and may be up to and including about 99% by
weight. Dry formulations of the compositions may be from about 1% to about 99% or more
by weight of the composition, while liquid formulations may generally comprise from about
1% to about 99% or more of the active ingredient by weight. Formulations which comprise
intact bacterial cells will generally contain from about 10⁴ to about 10¹² cells/mg.

The insecticidal formulation may be administered to a particular plant or target
area in one or more applications as needed, with a typical field application rate per hectare
ranging on the order of from about 1 g to about 1 kg, 2 kg, 5 kg, or more of active ingredient.
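
The relationship between the weight fraction of active ingredient and the per-hectare rate can be illustrated with a short calculation. The following Python sketch is illustrative only: the function name and the example figures (100 g/ha at 25% by weight) are assumptions chosen for the arithmetic and are not values taken from this disclosure.

```python
def formulation_mass_per_hectare(active_g_per_ha: float, active_fraction: float) -> float:
    """Return grams of formulated product per hectare needed to deliver
    `active_g_per_ha` grams of active ingredient.

    `active_fraction` is the weight fraction of active ingredient in the
    formulation (0.01 for 1% by weight, 0.99 for 99% by weight).
    """
    if active_g_per_ha <= 0:
        raise ValueError("application rate must be positive")
    if not 0 < active_fraction <= 1:
        raise ValueError("weight fraction must lie in (0, 1]")
    return active_g_per_ha / active_fraction


# Hypothetical example: 100 g/ha of active ingredient from a 25% w/w formulation.
print(formulation_mass_per_hectare(100.0, 0.25))
```

A lower weight fraction simply scales up the mass of formulated product that must be applied to reach the same rate of active ingredient.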

4.8 Nucleic Acid Segments as Hybridization Probes and Primers 

In addition to their use in directing the expression of crystal proteins or peptides
of the present invention, the nucleic acid sequences contemplated herein also have a variety
of other uses. For example, they also have utility as probes or primers in nucleic acid
hybridization embodiments. As such, it is contemplated that nucleic acid segments that
comprise a sequence region that consists of at least a 14 nucleotide long contiguous sequence
that has the same sequence as, or is complementary to, a 14 nucleotide long contiguous DNA
segment of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107 will
find particular utility. Longer contiguous identical or complementary sequences, e.g., those
of about 20, 30, 40, 50, 100, 200, 500, 1000, 2000, 5000, 10000, etc. (including all
intermediate lengths and up to and including full-length sequences) will also be of use in
certain embodiments.
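
The 14-nucleotide criterion described above can be expressed as a simple string test. The following Python sketch is illustrative only; the function names are hypothetical, and the reference string is a made-up placeholder rather than any sequence of the Sequence Listing.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_contiguous_match(segment: str, reference: str, min_len: int = 14) -> bool:
    """True if `segment` contains a stretch of at least `min_len` contiguous
    nucleotides that is identical to, or complementary to, a stretch of
    `reference`."""
    segment = segment.upper()
    reference = reference.upper()
    ref_rc = reverse_complement(reference)
    # Any match of length >= min_len contains a match of exactly min_len,
    # so it is enough to slide a min_len window along the segment.
    for start in range(len(segment) - min_len + 1):
        window = segment[start:start + min_len]
        if window in reference or window in ref_rc:
            return True
    return False


# Placeholder reference, invented for illustration only.
reference = "ATGAATCCTAACAATCGAAGTGAACATG"
probe = reverse_complement(reference[5:19])  # 14 nt, complementary to the reference
print(has_contiguous_match(probe, reference))
```

Raising `min_len` to 20, 30, or more applies the same test to the longer contiguous stretches mentioned above.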

The ability of such nucleic acid probes to specifically hybridize to crystal
protein-encoding sequences will enable them to be of use in detecting the presence of complementary
sequences in a given sample. However, other uses are envisioned, including the use of the
sequence information for the preparation of mutant species primers, or primers for use in
preparing other genetic constructions.

Nucleic acid molecules having sequence regions consisting of contiguous
nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so, identical
or complementary to DNA sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ
ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17,
SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID
NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID
NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61,
SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID
NO:101, or SEQ ID NO:107 are particularly contemplated as hybridization probes for use in,
e.g., Southern and Northern blotting. Smaller fragments will generally find use in
hybridization embodiments, wherein the length of the contiguous complementary region may
be varied, such as between about 10-14 and about 100 or 200 nucleotides, but larger
contiguous complementary stretches may be used, according to the length of the complementary
sequences one wishes to detect.

The use of a hybridization probe of about 14 nucleotides in length allows the
formation of a duplex molecule that is both stable and selective. Molecules having
contiguous complementary sequences over stretches greater than 14 bases in length are
generally preferred, though, in order to increase stability and selectivity of the hybrid, and
thereby improve the quality and degree of specific hybrid molecules obtained. One will
generally prefer to design nucleic acid molecules having gene-complementary stretches of 15
to 20 contiguous nucleotides, or even longer where desired.
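
One way to see why selectivity improves with probe length is to estimate how often a given sequence of that length would occur by chance. The following Python sketch is illustrative only. It assumes a random target with equal base frequencies, which real genomes do not satisfy exactly, and the target size used is a hypothetical round number, not a figure from this disclosure.

```python
def expected_random_hits(probe_len: int, target_len: int, both_strands: bool = True) -> float:
    """Expected number of exact chance occurrences of one specific probe
    sequence in a random target of `target_len` bases, assuming each base
    is equally likely (a simplifying assumption)."""
    if probe_len < 1 or target_len < probe_len:
        raise ValueError("need 1 <= probe_len <= target_len")
    positions = target_len - probe_len + 1   # possible start sites on one strand
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** probe_len


TARGET_BP = 5_000_000  # hypothetical target size, for illustration only
for n in (10, 14, 20):
    print(n, expected_random_hits(n, TARGET_BP))
```

Under these assumptions the expected number of chance matches falls by a factor of four with every added nucleotide, which is consistent with the preference stated above for stretches longer than 14 bases.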

Of course, fragments may also be obtained by other techniques such as, e.g., by 
mechanical shearing or by restriction enzyme digestion. Small nucleic acid segments or
fragments may be readily prepared by, for example, directly synthesizing the fragment by
chemical means, as is commonly practiced using an automated oligonucleotide synthesizer.
Also, fragments may be obtained by application of nucleic acid reproduction technology,
such as the PCR™ technology of U. S. Patents 4,683,195 and 4,683,202 (each incorporated
herein by reference), by introducing selected sequences into recombinant vectors for
recombinant production, and by other recombinant DNA techniques generally known to
those of skill in the art of molecular biology.

Accordingly, the nucleotide sequences of the invention may be used for their 
ability to selectively form duplex molecules with complementary stretches of DNA 
fragments. Depending on the application envisioned, one will desire to employ varying
conditions of hybridization to achieve varying degrees of selectivity of probe towards target
sequence. For applications requiring high selectivity, one will typically desire to employ
relatively stringent conditions to form the hybrids, e.g., one will select relatively low salt
and/or high temperature conditions, such as provided by about 0.02 M to about 0.15 M NaCl
at temperatures of about 50°C to about 70°C. Such selective conditions tolerate little, if any,
mismatch between the probe and the template or target strand, and would be particularly
suitable for isolating crystal protein-encoding DNA segments. Detection of DNA segments
via hybridization is well-known to those of skill in the art, and the teachings of U. S. Patents
4,965,188 and 5,176,995 (each incorporated herein by reference) are exemplary of the
methods of hybridization analyses. Teachings such as those found in the texts of Maloy et
al., 1994; Segal, 1976; Prokop, 1991; and Kuby, 1994, are particularly relevant.

Of course, for some applications, for example, where one desires to prepare 
mutants employing a mutant primer strand hybridized to an underlying template or where one 
seeks to isolate crystal protein-encoding sequences from related species, functional 
equivalents, or the like, less stringent hybridization conditions will typically be needed in
order to allow formation of the heteroduplex. In these circumstances, one may desire to
employ conditions such as about 0.15 M to about 0.9 M salt, at temperatures ranging from
about 20°C to about 55°C. Cross-hybridizing species can thereby be readily identified as
positively hybridizing signals with respect to control hybridizations. In any case, it is
generally appreciated that conditions can be rendered more stringent by the addition of
increasing amounts of formamide, which serves to destabilize the hybrid duplex in the same
manner as increased temperature. Thus, hybridization conditions can be readily manipulated,
and thus will generally be a method of choice depending on the desired results.
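
The combined effect of salt, temperature, and formamide on stringency can be illustrated with an empirical melting-temperature estimate for DNA:DNA hybrids. The following Python sketch is illustrative only. It uses one commonly cited approximation from the general hybridization literature, not a formula given in this disclosure; the coefficients vary somewhat between references, and the example inputs (40% G+C, 200 bp) are hypothetical.

```python
import math


def approx_tm_celsius(na_molar: float, gc_percent: float, length_bp: int,
                      formamide_percent: float = 0.0) -> float:
    """Approximate melting temperature (deg C) of a DNA:DNA hybrid.

    Empirical estimate, intended for hybrids well over ~50 bp:
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Hypothetical hybrid: 40% G+C, 200 bp, at the salt extremes mentioned in the text.
for salt in (0.02, 0.15, 0.9):
    print(salt, round(approx_tm_celsius(salt, 40.0, 200), 1))
```

In this approximation, lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a fixed hybridization temperature sits closer to it and mismatched duplexes are less likely to survive.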

In certain embodiments, it will be advantageous to employ nucleic acid sequences 
of the present invention in combination with an appropriate means, such as a label, for
determining hybridization. A wide variety of appropriate indicator means are known in the
art, including fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin,
which are capable of giving a detectable signal. In preferred embodiments, one will likely
desire to employ a fluorescent label or an enzyme tag, such as urease, alkaline phosphatase or
peroxidase, instead of radioactive or other environmentally undesirable reagents. In the case of
enzyme tags, colorimetric indicator substrates are known that can be employed to provide a
means visible to the human eye or spectrophotometrically, to identify specific hybridization
with complementary nucleic acid-containing samples.

In general, it is envisioned that the hybridization probes described herein will be 
useful both as reagents in solution hybridization as well as in embodiments employing a solid
phase. In embodiments involving a solid phase, the test DNA (or RNA) is adsorbed or
otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is
then subjected to specific hybridization with selected probes under desired conditions. The
selected conditions will depend on the particular circumstances based on the particular
criteria required (depending, for example, on the G+C content, type of target nucleic acid,
source of nucleic acid, size of hybridization probe, etc.). Following washing of the
hybridized surface so as to remove nonspecifically bound probe molecules, specific
hybridization is detected, or even quantitated, by means of the label.

4.9 Characteristics of Modified Cry3 δ-Endotoxins

The present invention provides novel polypeptides that define a whole or a 
portion of a B. thuringiensis cry3Bb.60, cry3Bb.11221, cry3Bb.11222, cry3Bb.11223,
cry3Bb.11224, cry3Bb.11225, cry3Bb.11226, cry3Bb.11227, cry3Bb.11228, cry3Bb.11229,
cry3Bb.11230, cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234, cry3Bb.11235,
cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241, cry3Bb.11242,
cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048, cry3Bb.11051,
cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082, cry3Bb.11083, cry3Bb.11084,
cry3Bb.11095 and cry3Bb.11098-encoded crystal protein.

4.10 Crystal Protein Nomenclature 

The inventors have arbitrarily assigned the designations Cry3Bb.60,
Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225,
Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046,
Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095 and Cry3Bb.11098 to the
novel proteins of the invention.

Likewise, the arbitrary designations of cry3Bb.60, cry3Bb.11221, cry3Bb.11222,
cry3Bb.11223, cry3Bb.11224, cry3Bb.11225, cry3Bb.11226, cry3Bb.11227, cry3Bb.11228,
cry3Bb.11229, cry3Bb.11230, cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234,
cry3Bb.11235, cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241,
cry3Bb.11242, cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048,
cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082, cry3Bb.11083,
cry3Bb.11084, cry3Bb.11095 and cry3Bb.11098 have been assigned to the novel nucleic
acid sequences which encode these polypeptides, respectively. While formal assignment of
gene and protein designations based on the revised nomenclature of crystal protein
endotoxins (Table 1) may be made by the committee on the nomenclature of B. thuringiensis,
any re-designations of the compositions of the present invention are also contemplated to be
fully within the scope of the present disclosure.

4.11 Transformed Host Cells and Transgenic Plants 

A bacterium, a yeast cell, or a plant cell or a plant transformed with an expression
vector of the present invention is also contemplated. A transgenic bacterium, yeast cell, plant
cell or plant derived from such a transformed or transgenic cell is also one aspect of the
invention.

Such transformed host cells are often desirable for use in the production of
endotoxins and for expression of the various DNA gene constructs disclosed herein. In some
aspects of the invention, it is often desirable to modulate, regulate, or otherwise control the
expression of the gene segments disclosed herein. Such methods are routine to those of skill in
the molecular genetic arts. Typically, when increased or over-expression of a particular gene
is desired, various manipulations may be employed for enhancing the expression of the
messenger RNA, particularly by using an active promoter, as well as by employing sequences
which enhance the stability of the messenger RNA in the particular transformed host cell.

Typically, the transcriptional and translational termination region will involve stop
codon(s), a terminator region, and optionally, a polyadenylation signal. In the direction of
transcription, namely in the 5' to 3' direction of the coding or sense sequence, the construct
will involve the transcriptional regulatory region, if any, and the promoter, where the
regulatory region may be either 5' or 3' of the promoter, the ribosomal binding site, the initiation
codon, the structural gene having an open reading frame in phase with the initiation codon,
the stop codon(s), the polyadenylation signal sequence, if any, and the terminator region.
This sequence as a double strand may be used by itself for transformation of a microorganism
host, but will usually be included with a DNA sequence involving a marker, where the
second DNA sequence may be joined to the δ-endotoxin expression construct during
introduction of the DNA into the host.
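
The 5' to 3' ordering of construct elements described above can be written down as a small ordering check. The following Python sketch is illustrative only; the element names and the function are hypothetical labels introduced for this example, and the stop codon(s) are treated as a single element.

```python
# Elements that are always present, in 5' -> 3' order.
REQUIRED_ORDER = [
    "promoter",
    "ribosomal_binding_site",
    "initiation_codon",
    "structural_gene",
    "stop_codon",
    "terminator",
]
# Elements described as present "if any".
OPTIONAL = {"regulatory_region", "polyadenylation_signal"}


def is_valid_layout(elements: list[str]) -> bool:
    """Check that `elements` (listed 5' -> 3') follows the described order."""
    if any(elements.count(name) > 1 for name in OPTIONAL):
        return False
    core = [e for e in elements if e not in OPTIONAL]
    if core != REQUIRED_ORDER:
        return False
    if "regulatory_region" in elements:
        # The regulatory region may sit immediately 5' or 3' of the promoter.
        gap = elements.index("regulatory_region") - elements.index("promoter")
        if abs(gap) != 1:
            return False
    if "polyadenylation_signal" in elements:
        # The polyadenylation signal follows the stop codon(s).
        i = elements.index("polyadenylation_signal")
        if i == 0 or elements[i - 1] != "stop_codon":
            return False
    return True


full = ["regulatory_region"] + REQUIRED_ORDER[:5] + ["polyadenylation_signal", "terminator"]
print(is_valid_layout(full))
```

A marker gene joined to the construct, as mentioned above, would be handled as a separate DNA sequence and is therefore not part of this ordering.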

By a marker is intended a structural gene which provides for selection of those
hosts which have been modified or transformed. The marker will normally provide for
selective advantage, for example, providing for biocide resistance, e.g., resistance to antibiotics
or heavy metals; complementation, so as to provide prototrophy to an auxotrophic host, or the
like. Preferably, complementation is employed, so that the modified host may not only be
selected, but may also be competitive in the field. One or more markers may be employed in
the development of the constructs, as well as for modifying the host. The organisms may be
further modified by providing for a competitive advantage against other wild-type
microorganisms in the field. For example, genes expressing metal chelating agents, e.g.,
siderophores, may be introduced into the host along with the structural gene expressing the
δ-endotoxin. In this manner, the enhanced expression of a siderophore may provide for a
competitive advantage for the δ-endotoxin-producing host, so that it may effectively compete
with the wild-type microorganisms and stably occupy a niche in the environment.

Where no functional replication system is present, the construct will also include a
sequence of at least 50 basepairs (bp), preferably at least about 100 bp, and usually not more
than about 1000 bp of a sequence homologous with a sequence in the host. In this way, the
probability of legitimate recombination is enhanced, so that the gene will be integrated into
the host and stably maintained by the host. Desirably, the δ-endotoxin gene will be in close
proximity to the gene providing for complementation as well as the gene providing for the
competitive advantage. Therefore, in the event that a δ-endotoxin gene is lost, the resulting
organism will be likely to also lose the complementing gene and/or the gene providing for the
competitive advantage, so that it will be unable to compete in the environment with
organisms retaining the intact construct.
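
The length guidance for the homologous sequence can be summarized as a simple classification. The following Python sketch is illustrative only; the function name and the category labels are hypothetical, and the thresholds are treated as exact although the text gives them as approximate.

```python
def classify_homology_length(bp: int) -> str:
    """Classify a homologous sequence length against the stated guidance:
    at least 50 bp, preferably at least about 100 bp, and usually not more
    than about 1000 bp."""
    if bp < 50:
        return "below minimum"
    if bp < 100:
        return "acceptable"
    if bp <= 1000:
        return "preferred"
    return "longer than usual"


for length in (30, 75, 400, 2500):
    print(length, classify_homology_length(length))
```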

The crystal protein-encoding gene can be introduced between the transcriptional 
and translational initiation region and the transcriptional and translational termination region, 
so as to be under the regulatory control of the initiation region. This construct will be
included in a plasmid, which will include at least one replication system, but may include more
than one, where one replication system is employed for cloning during the development of
the plasmid and the second replication system is necessary for functioning in the ultimate
host. In addition, one or more markers may be present, which have been described
previously. Where integration is desired, the plasmid will desirably include a sequence
homologous with the host genome.

The transformants can be isolated in accordance with conventional ways, usually
employing a selection technique, which allows for selection of the desired organism as
against unmodified organisms or transferring organisms, when present. The transformants
then can be tested for pesticidal activity.

Suitable host cells, where the pesticide-containing cells will be treated to prolong
the activity of the δ-endotoxin in the cell when the then treated cell is applied to the
environment of target pest(s), may include either prokaryotes or eukaryotes, normally being limited
to those cells which do not produce substances toxic to higher organisms, such as mammals.
However, organisms which produce substances toxic to higher organisms could be used,
where the δ-endotoxin is unstable or the level of application sufficiently low as to avoid any
possibility of toxicity to a mammalian host. As hosts, of particular interest will be the
prokaryotes and the lower eukaryotes, such as fungi. Illustrative prokaryotes, both
Gram-negative and -positive, include Enterobacteriaceae, such as Escherichia, Erwinia, Shigella,
Salmonella, and Proteus; Bacillaceae; Rhizobiaceae, such as Rhizobium; Spirillaceae, such as
Photobacterium, Zymomonas, Serratia, Aeromonas, Vibrio, Desulfovibrio, Spirillum;
Lactobacillaceae; phylloplane organisms such as members of the Pseudomonadaceae (including
Pseudomonas spp. and Acetobacter spp.); Azotobacteraceae and Nitrobacteraceae;
Flavobacterium spp.; members of the Bacillaceae such as Lactobacillus spp., Bifidobacterium, and
Bacillus spp., and the like. Particularly preferred host cells include Pseudomonas
aeruginosa, Pseudomonas fluorescens, Bacillus thuringiensis, Escherichia coli, Bacillus subtilis,
and the like.

Among eukaryotes are fungi, such as Phycomycetes and Ascomycetes, which
includes yeast, such as Schizosaccharomyces; and Basidiomycetes, Rhodotorula,
Aureobasidium, Sporobolomyces, Saccharomyces spp., and Sporobolomyces spp.

Characteristics of particular interest in selecting a host cell for purposes of
production include ease of introducing the δ-endotoxin gene into the host, availability of
expression systems, efficiency of expression, stability of the pesticide in the host, and the
presence of auxiliary genetic capabilities. Characteristics of interest for use as a pesticide
microcapsule include protective qualities for the pesticide, such as thick cell walls, pigmentation,
and intracellular packaging or formation of inclusion bodies; leaf affinity; lack of mammalian
toxicity; attractiveness to pests for ingestion; ease of killing and fixing without damage to the
δ-endotoxin; and the like. Other considerations include ease of formulation and handling,
economics, storage stability, and the like.

The cell will usually be intact and be substantially in the proliferative form when
treated, rather than in a spore form, although in some instances spores may be employed.
Treatment of the recombinant microbial cell can be done as disclosed infra. The treated cells
generally will have enhanced structural stability which will enhance resistance to
environmental conditions.

Genes or other nucleic acid segments, as disclosed herein, can be inserted into
host cells using a variety of techniques which are well known in the art. For example, a large
number of cloning vectors comprising a replication system in E. coli and a marker that
permits selection of the transformed cells are available for preparation for the insertion of
foreign genes into higher organisms, including plants. The vectors comprise, for example,
pBR322, pUC series, M13mp series, pACYC184, etc. Accordingly, the sequence coding for
the δ-endotoxin can be inserted into the vector at a suitable restriction site. The resulting
plasmid is used for transformation into E. coli. The E. coli cells are cultivated in a suitable
nutrient medium, then harvested and lysed. The plasmid is recovered. Sequence analysis,
restriction analysis, electrophoresis, and other biochemical-molecular biological methods are
generally carried out as methods of analysis. After each manipulation, the DNA sequence
used can be cleaved and joined to the next DNA sequence. Each plasmid sequence can be
cloned in the same or other plasmids. Depending on the method of inserting desired genes
into the plant, other DNA sequences may be necessary.

Methods for DNA transformation of plant cells include Agrobacterium-mediated
plant transformation, protoplast transformation, gene transfer into pollen, injection into
reproductive organs, injection into immature embryos and particle bombardment. Each of
these methods has distinct advantages and disadvantages. Thus, one particular method of
introducing genes into a particular plant strain may not necessarily be the most effective for
another plant strain, but it is well known which methods are useful for a particular plant
strain.

Suitable methods are believed to include virtually any method by which DNA can
be introduced into a cell, such as by Agrobacterium infection, direct delivery of DNA such
as, for example, by PEG-mediated transformation of protoplasts (Omirulleh et al., 1993), by
desiccation/inhibition-mediated DNA uptake, by electroporation, by agitation with silicon
carbide fibers, by acceleration of DNA coated particles, etc. In certain embodiments,
acceleration methods are preferred and include, for example, microprojectile bombardment
and the like.

Technology for introduction of DNA into cells is well-known to those of skill in
the art. Four general methods for delivering a gene into cells have been described: (1)
chemical methods (Graham and van der Eb, 1973; Zatloukal et al., 1992); (2) physical
methods such as microinjection (Capecchi, 1980), electroporation (Wong and Neumann,
1982; Fromm et al., 1985) and the gene gun (Johnston and Tang, 1994; Fynan et al., 1993);
(3) viral vectors (Clapp, 1993; Lu et al., 1993; Eglitis and Anderson, 1988; Eglitis et al.,
1988); and (4) receptor-mediated mechanisms (Curiel et al., 1991; 1992; Wagner et al.,
1992).

A large number of techniques are available for inserting DNA into a plant host
cell. Those techniques include transformation with T-DNA using Agrobacterium tumefaciens
or Agrobacterium rhizogenes as transformation agent, fusion, injection, or electroporation
as well as other possible methods. If agrobacteria are used for the transformation, the DNA
to be inserted has to be cloned into special plasmids, namely either into an intermediate
vector or into a binary vector. The intermediate vectors can be integrated into the Ti or Ri
plasmid by homologous recombination owing to sequences that are homologous to sequences
in the T-DNA. The Ti or Ri plasmid also comprises the vir region necessary for the transfer
of the T-DNA.

Intermediate vectors cannot replicate themselves in agrobacteria. The intermediate
vector can be transferred into Agrobacterium tumefaciens by means of a helper plasmid
(conjugation). Binary vectors can replicate themselves both in E. coli and in agrobacteria.
They comprise a selection marker gene and a linker or polylinker which are framed by the
right and left T-DNA border regions. They can be transformed directly into agrobacteria
(Holsters et al., 1978). The agrobacterium used as host cell is to comprise a plasmid carrying
a vir region. The vir region is necessary for the transfer of the T-DNA into the plant cell.
Additional T-DNA may be contained. The bacterium so transformed is used for the
transformation of plant cells. Plant explants can advantageously be cultivated with Agrobacterium
tumefaciens or Agrobacterium rhizogenes for the transfer of the DNA into the plant cell.
Whole plants can then be regenerated from the infected plant material (for example, pieces of
leaf, segments of stalk, roots, but also protoplasts or suspension-cultivated cells) in a suitable
medium, which may contain antibiotics or biocides for selection. The plants so obtained can
then be tested for the presence of the inserted DNA. No special demands are made of the
plasmids in the case of injection and electroporation. It is possible to use ordinary plasmids,
such as, for example, pUC derivatives. If, for example, the Ti or Ri plasmid is used for the
transformation of the plant cell, then at least the right border, but often the right and the left
border of the Ti or Ri plasmid T-DNA, has to be joined as the flanking region of the genes to
be inserted. The use of T-DNA for the transformation of plant cells has been intensively
researched and sufficiently described in Eur. Pat. Appl. No. EP 120 516; Hoekema (1985); An
et al. (1985); Herrera-Estrella et al. (1983); Bevan et al. (1983); and Klee et al. (1985).

A particularly useful Ti plasmid cassette vector for transformation of dicotyledonous
plants consists of the enhanced CaMV35S promoter (EN35S) and the 3' end including
polyadenylation signals from a soybean gene encoding the α'-subunit of β-conglycinin.
Between these two elements is a multilinker containing multiple restriction sites for the
insertion of genes of interest.

The vector preferably contains a segment of pBR322 which provides an origin of
replication in E. coli and a region for homologous recombination with the disarmed T-DNA
in Agrobacterium strain ACO; the oriV region from the broad host range plasmid RK1; the
streptomycin/spectinomycin resistance gene from Tn7; and a chimeric NPTII gene, containing
the CaMV35S promoter and the nopaline synthase (NOS) 3' end, which provides
kanamycin resistance in transformed plant cells.

Optionally, the enhanced CaMV35S promoter may be replaced with the 1.5 kb
mannopine synthase (MAS) promoter (Velten et al., 1984). After incorporation of a DNA
construct into the vector, it is introduced into A. tumefaciens strain ACO which contains a
disarmed Ti plasmid. Cointegrate Ti plasmid vectors are selected and subsequently may be
used to transform a dicotyledonous plant.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE described by Fraley
et al. (1985). For construction of ACO the starting Agrobacterium strain was the strain A208
which contains a nopaline-type Ti plasmid. The Ti plasmid was disarmed in a manner similar
to that described by Fraley et al. (1985) so that essentially all of the native T-DNA was
removed except for the left border and a few hundred base pairs of T-DNA inside the left
border. The remainder of the T-DNA extending to a point just beyond the right border was
replaced with a novel piece of DNA including (from left to right) a segment of pBR322, the
oriV region from plasmid RK2, and the kanamycin resistance gene from Tn601. The
pBR322 and oriV segments are similar to these segments and provide a region of homology
for cointegrate formation.

Once the inserted DNA has been integrated in the genome, it is relatively stable
there and, as a rule, does not come out again. It normally contains a selection marker that
confers on the transformed plant cells resistance to a biocide or an antibiotic, such as
kanamycin, G418, bleomycin, hygromycin, or chloramphenicol, inter alia. The individually
employed marker should accordingly permit the selection of transformed cells rather than
cells that do not contain the inserted DNA.

4.11.1 Electroporation

The application of brief, high-voltage electric pulses to a variety of animal and
plant cells leads to the formation of nanometer-sized pores in the plasma membrane. DNA is
taken directly into the cell cytoplasm either through these pores or as a consequence of the
redistribution of membrane components that accompanies closure of the pores.
Electroporation can be extremely efficient and can be used both for transient expression of
cloned genes and for establishment of cell lines that carry integrated copies of the gene of
interest. Electroporation, in contrast to calcium phosphate-mediated transfection and
protoplast fusion, frequently gives rise to cell lines that carry one, or at most a few, integrated
copies of the foreign DNA.

The introduction of DNA by means of electroporation is well-known to those of
skill in the art. In this method, certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, are employed to render the target recipient cells more susceptible to transformation
by electroporation than untreated cells. Alternatively, recipient cells are made more
susceptible to transformation by mechanical wounding. To effect transformation by
electroporation one may employ either friable tissues such as a suspension culture of cells, or
embryogenic callus, or alternatively, one may transform immature embryos or other
organized tissues directly. One would partially degrade the cell walls of the chosen cells by
exposing them to pectin-degrading enzymes (pectolyases) or mechanically wounding in a
controlled manner. Such cells would then be recipient to DNA transfer by electroporation,
which may be carried out at this stage, and transformed cells then identified by a suitable
selection or screening protocol dependent on the nature of the newly incorporated DNA.

4.11.2 Microprojectile Bombardment

A further advantageous method for delivering transforming DNA segments to
plant cells is microprojectile bombardment. In this method, particles may be coated with
nucleic acids and delivered into cells by a propelling force. Exemplary particles include
those comprised of tungsten, gold, platinum, and the like.

An advantage of microprojectile bombardment, in addition to it being an effective
means of reproducibly stably transforming monocots, is that neither the isolation of
protoplasts (Cristou et al., 1988) nor the susceptibility to Agrobacterium infection is
required. An illustrative embodiment of a method for delivering DNA into maize cells by
acceleration is a Biolistics Particle Delivery System, which can be used to propel particles
coated with DNA or cells through a screen, such as a stainless steel or Nytex screen, onto a
filter surface covered with corn cells cultured in suspension. The screen disperses the
particles so that they are not delivered to the recipient cells in large aggregates. It is believed
that a screen intervening between the projectile apparatus and the cells to be bombarded
reduces the size of projectile aggregates and may contribute to a higher frequency of
transformation by reducing damage inflicted on the recipient cells by projectiles that are too
large.

For the bombardment, cells in suspension are preferably concentrated on filters or
solid culture medium. Alternatively, immature embryos or other target cells may be arranged
on solid culture medium. The cells to be bombarded are positioned at an appropriate distance
below the macroprojectile stopping plate. If desired, one or more screens are also positioned
between the acceleration device and the cells to be bombarded. Through the use of
techniques set forth herein one may obtain up to 1000 or more foci of cells transiently
expressing a marker gene. The number of cells in a focus which express the exogenous gene
product 48 hours post-bombardment often ranges from 1 to 10 and averages 1 to 3.

In bombardment transformation, one may optimize the prebombardment culturing
conditions and the bombardment parameters to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment are important
in this technology. Physical factors are those that involve manipulating the
DNA/microprojectile precipitate or those that affect the flight and velocity of either the
macro- or microprojectiles. Biological factors include all steps involved in manipulation of
cells before and immediately after bombardment, the osmotic adjustment of target cells to
help alleviate the trauma associated with bombardment, and also the nature of the
transforming DNA, such as linearized DNA or intact supercoiled plasmids. It is believed that
pre-bombardment manipulations are especially important for successful transformation of
immature embryos.

Accordingly, it is contemplated that one may wish to adjust various of the
bombardment parameters in small scale studies to fully optimize the conditions. One may
particularly wish to adjust physical parameters such as gap distance, flight distance, tissue
distance, and helium pressure. One may also minimize the trauma reduction factors (TRFs)
by modifying conditions which influence the physiological state of the recipient cells and
which may therefore influence transformation and integration efficiencies. For example, the
osmotic state, tissue hydration and the subculture stage or cell cycle of the recipient cells may
be adjusted for optimum transformation. The execution of other routine adjustments will be
known to those of skill in the art in light of the present disclosure.

4.11.3 Agrobacterium-Mediated Transfer

Agrobacterium-mediated transfer is a widely applicable system for introducing
genes into plant cells because the DNA can be introduced into whole plant tissues, thereby
bypassing the need for regeneration of an intact plant from a protoplast. The use of
Agrobacterium-mediated plant integrating vectors to introduce DNA into plant cells is well
known in the art. See, for example, the methods described (Fraley et al., 1985; Rogers et al.,
1987). Further, the integration of the Ti-DNA is a relatively precise process resulting in few
rearrangements. The region of DNA to be transferred is defined by the border sequences, and
intervening DNA is usually inserted into the plant genome as described (Spielmann et al.,
1986; Jorgensen et al., 1987).

Modern Agrobacterium transformation vectors are capable of replication in E. coli
as well as Agrobacterium, allowing for convenient manipulations as described (Klee et al.,
1985). Moreover, recent technological advances in vectors for Agrobacterium-mediated gene
transfer have improved the arrangement of genes and restriction sites in the vectors to
facilitate construction of vectors capable of expressing various polypeptide coding genes.
The vectors described (Rogers et al., 1987) have convenient multi-linker regions flanked by
a promoter and a polyadenylation site for direct expression of inserted polypeptide coding
genes and are suitable for present purposes. In addition, Agrobacterium containing both
armed and disarmed Ti genes can be used for the transformations. In those plant strains
where Agrobacterium-mediated transformation is efficient, it is the method of choice because
of the facile and defined nature of the gene transfer.

Agrobacterium-mediated transformation of leaf disks and other tissues such as
cotyledons and hypocotyls appears to be limited to plants that Agrobacterium naturally
infects. Agrobacterium-mediated transformation is most efficient in dicotyledonous plants.
Few monocots appear to be natural hosts for Agrobacterium, although transgenic plants have
been produced in asparagus using Agrobacterium vectors as described (Bytebier et al., 1987).
Therefore, commercially important cereal grains such as rice, corn, and wheat must usually
be transformed using alternative methods. However, as mentioned above, the transformation
of asparagus using Agrobacterium can also be achieved (see, for example, Bytebier et al.,
1987).

A transgenic plant formed using Agrobacterium transformation methods typically
contains a single gene on one chromosome. Such transgenic plants can be referred to as
being heterozygous for the added gene. However, inasmuch as use of the word
"heterozygous" usually implies the presence of a complementary gene at the same locus of
the second chromosome of a pair of chromosomes, and there is no such gene in a plant
containing one added gene as here, it is believed that a more accurate name for such a plant is
an independent segregant, because the added, exogenous gene segregates independently
during mitosis and meiosis.

More preferred is a transgenic plant that is homozygous for the added structural
gene; i.e., a transgenic plant that contains two added genes, one gene at the same locus on
each chromosome of a chromosome pair. A homozygous transgenic plant can be obtained by
sexually mating (selfing) an independent segregant transgenic plant that contains a single
added gene, germinating some of the seed produced and analyzing the resulting plants
produced for enhanced carboxylase activity relative to a control (native, non-transgenic) or an
independent segregant transgenic plant.
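
The selfing step can be illustrated with a small enumeration. This is only a sketch under simple Mendelian assumptions that the text does not state: a single insertion locus and equal transmission of both kinds of gamete. The function name `selfing_ratios` and the allele labels are hypothetical, chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_ratios(parent=("T", "-")):
    """Enumerate offspring genotypes from selfing a plant with two alleles.

    "T" marks a chromosome carrying the added gene and "-" marks the empty
    locus (illustrative labels). Assumes one insertion locus and equal
    transmission of both gametes.
    """
    # Each offspring receives one allele from each of two gametes of the parent.
    counts = Counter("".join(sorted(pair)) for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


ratios = selfing_ratios()
for genotype, fraction in sorted(ratios.items()):
    print(genotype, fraction)
```

Under these assumptions one quarter of the progeny are expected to carry the added gene on both chromosomes, which is why only some of the germinated seed needs to be analyzed to find a homozygous plant.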

It is to be understood that two different transgenic plants can also be mated to 
produce offspring that contain two independently segregating added, exogenous genes. 
Selfing of appropriate progeny can produce plants that are homozygous for both added, 
exogenous genes that encode a polypeptide of interest. Back-crossing to a parental plant and 

out-crossing with a non-transgenic plant are also contemplated.
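
The expected outcome of mating two different transgenic plants, and of selfing their progeny, can be sketched in the same way. The sketch assumes two unlinked insertion loci and independent assortment, neither of which is stated in the text; the genotype encoding and the names `gametes` and `cross` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """All gametes of a genotype given as one two-character string per locus."""
    return list(product(*genotype))


def cross(parent1, parent2):
    """Return {offspring genotype: probability}, assuming independent assortment."""
    g1, g2 = gametes(parent1), gametes(parent2)
    weight = Fraction(1, len(g1) * len(g2))
    outcomes = {}
    for a, b in product(g1, g2):
        # Combine the two gametes locus by locus; sort so "A-" and "-A" coincide.
        child = tuple("".join(sorted(x + y)) for x, y in zip(a, b))
        outcomes[child] = outcomes.get(child, 0) + weight
    return outcomes


# Plant 1 carries gene A at locus 1 only; plant 2 carries gene B at locus 2 only.
f1 = cross(("A-", "--"), ("--", "B-"))
both_genes = f1[("-A", "-B")]

# Selfing an offspring that carries one copy of each added gene.
f2 = cross(("A-", "B-"), ("A-", "B-"))
double_homozygous = f2[("AA", "BB")]
```

With these assumptions a quarter of the first-generation offspring carry both added genes, and one sixteenth of the selfed progeny are homozygous for both.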

Transformation of plant protoplasts can be achieved using methods based on
calcium phosphate precipitation, polyethylene glycol treatment, electroporation, and
combinations of these treatments (see, e.g., Potrykus et al., 1985; Lorz et al., 1985; Fromm et
al., 1985; Uchimiya et al., 1986; Callis et al., 1987; Marcotte et al., 1988).

Application of these systems to different plant strains depends upon the ability to
regenerate that particular plant strain from protoplasts. Illustrative methods for the
regeneration of cereals from protoplasts are described (Fujimura et al., 1985; Toriyama et al.,
1986; Yamada et al., 1986; Abdullah et al., 1986).

To transform plant strains that cannot be successfully regenerated from
protoplasts, other ways to introduce DNA into intact cells or tissues can be utilized. For
example, regeneration of cereals from immature embryos or explants can be effected as
described (Vasil, 1988). In addition, "particle gun" or high-velocity microprojectile
technology can be utilized (Vasil, 1992).

Using that latter technology, DNA is carried through the cell wall and into the
cytoplasm on the surface of small metal particles as described (Klein et al., 1987; Klein et al.,
1988; McCabe et al., 1988). The metal particles penetrate through several layers of cells and
thus allow the transformation of cells within tissue explants.

4.11.4 Gene Expression in Plants 

Although great progress has been made in recent years with respect to preparation
of transgenic plants which express bacterial proteins such as B. thuringiensis crystal proteins,
the results of expressing native bacterial genes in plants are often disappointing. Unlike
microbial genetics, little was known by early plant geneticists about the factors which affected
heterologous expression of foreign genes in plants. In recent years, however, several potential
factors have been implicated as responsible in varying degrees for the level of protein
expression from a particular coding sequence. For example, scientists now know that
maintaining a significant level of a particular mRNA in the cell is indeed a critical factor.
Unfortunately, the causes for low steady state levels of mRNA encoding foreign proteins are many.
First, full length RNA synthesis may not occur at a high frequency. This could, for example,
be caused by the premature termination of RNA during transcription or due to unexpected
mRNA processing during transcription. Second, full length RNA may be produced in the
plant cell, but then processed (splicing, polyA addition) in the nucleus in a fashion that
creates a nonfunctional mRNA. If the RNA is not properly synthesized, terminated and
polyadenylated, it cannot move to the cytoplasm for translation. Similarly, in the cytoplasm, if
mRNAs have reduced half lives (which are determined by their primary or secondary
sequence) insufficient protein product will be produced. In addition, there is an effect, whose
magnitude is uncertain, of translational efficiency on mRNA half-life. In addition, every
RNA molecule folds into a particular structure, or perhaps family of structures, which is
determined by its sequence. The particular structure of any RNA might lead to greater or lesser
stability in the cytoplasm. Structure per se is probably also a determinant of mRNA
processing in the nucleus. Unfortunately, it is impossible to predict, and nearly impossible to
determine, the structure of any RNA (except for tRNA) in vitro or in vivo. However, it is likely
that dramatically changing the sequence of an RNA will have a large effect on its folded
structure. It is likely that structure per se or particular structural features also have a role in
determining RNA stability.

To overcome these limitations in foreign gene expression, researchers have
identified particular sequences and signals in RNAs that have the potential for having a specific
effect on RNA stability. In certain embodiments of the invention, therefore, there is a desire
to optimize expression of the disclosed nucleic acid segments in planta. One particular
method of doing so is by alteration of the bacterial gene to remove sequences or motifs
which decrease expression in a transformed plant cell. The process of engineering a coding
sequence for optimal expression in planta is often referred to as "plantizing" a DNA
sequence.

Particularly problematic sequences are those which are A+T rich. Unfortunately,
since B. thuringiensis has an A+T rich genome, native crystal protein gene sequences must
often be modified for optimal expression in a plant. The sequence motif ATTTA (or
AUUUA as it appears in RNA) has been implicated as a destabilizing sequence in mammalian
cell mRNA (Shaw and Kamen, 1986). Many short lived mRNAs have A+T rich 3'
untranslated regions, and these regions often have the ATTTA sequence, sometimes present in
multiple copies or as multimers (e.g., ATTTATTTA...). Shaw and Kamen showed that the
transfer of the 3' end of an unstable mRNA to a stable RNA (globin or VA1) decreased the
stable RNA's half life dramatically. They further showed that a pentamer of ATTTA had a
profound destabilizing effect on a stable message, and that this signal could exert its effect
whether it was located at the 3' end or within the coding sequence. However, the number of
ATTTA sequences and/or the sequence context in which they occur also appear to be
important in determining whether they function as destabilizing sequences. Shaw and Kamen
showed that a trimer of ATTTA had much less effect than a pentamer on mRNA stability and
a dimer or a monomer had no effect on stability (Shaw and Kamen, 1987). Note that
multimers of ATTTA such as a pentamer automatically create an A+T rich region. This was
shown to be a cytoplasmic effect, not nuclear. In other unstable mRNAs, the ATTTA
sequence may be present in only a single copy, but it is often contained in an A+T rich region.
From the animal cell data collected to date, it appears that ATTTA at least in some contexts
is important in stability, but it is not yet possible to predict which occurrences of ATTTA are
destabilizing elements or whether any of these effects are likely to be seen in plants.
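
Locating ATTTA motifs and grouping them into multimers is a simple sequence scan. The sketch below is illustrative only: it reads a multimer as a run of overlapping copies sharing their terminal A, as in the ATTTATTTA example above, and that reading is an assumption. The names `find_attta` and `attta_runs` are hypothetical.

```python
import re

# A zero-width lookahead reports overlapping copies (as in ATTTATTTA) separately.
ATTTA = re.compile(r"(?=ATTTA)")


def find_attta(dna):
    """Return 0-based start positions of every ATTTA in a sequence (U read as T)."""
    seq = dna.upper().replace("U", "T")
    return [match.start() for match in ATTTA.finditer(seq)]


def attta_runs(positions):
    """Group start positions into runs of overlapping copies (starts 4 nt apart)."""
    runs = []
    for pos in positions:
        if runs and pos - runs[-1][-1] == 4:
            runs[-1].append(pos)
        else:
            runs.append([pos])
    return runs


dimer = find_attta("ATTTATTTA")
pentamer_runs = attta_runs(find_attta("GC" + "ATTT" * 5 + "A" + "GC"))
```

A scan of this kind only lists candidate sites. As the passage notes, whether a given occurrence acts as a destabilizing element cannot be predicted from the sequence alone.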

Some studies on mRNA degradation in animal cells also indicate that RNA
degradation may begin in some cases with nucleolytic attack in A+T rich regions. It is not clear
if these cleavages occur at ATTTA sequences. There are also examples of mRNAs that have
differential stability depending on the cell type in which they are expressed or on the stage
within the cell cycle at which they are expressed. For example, histone mRNAs are stable
during DNA synthesis but unstable if DNA synthesis is disrupted. The 3' end of some
histone mRNAs seems to be responsible for this effect (Pandey and Marzluff, 1987). It does not
appear to be mediated by ATTTA, nor is it clear what controls the differential stability of this
mRNA. Another example is the differential stability of IgG mRNA in B lymphocytes during
B cell maturation (Genovese and Milcarek, 1988). A final example is the instability of a
mutant β-thalassemic globin mRNA. In bone marrow cells, where this gene is normally
expressed, the mutant mRNA is unstable, while the wild-type mRNA is stable. When the
mutant gene is expressed in HeLa or L cells in vitro, the mutant mRNA shows no instability
(Lim et al., 1988). These examples all provide evidence that mRNA stability can be
mediated by cell type or cell cycle specific factors. Furthermore, this type of instability is not yet
associated with specific sequences. Given these uncertainties, it is not possible to predict
which RNAs are likely to be unstable in a given cell. In addition, even the ATTTA motif
may act differentially depending on the nature of the cell in which the RNA is present. Shaw
and Kamen (1987) have reported that activation of protein kinase C can block degradation
mediated by ATTTA.

The addition of a polyadenylate string to the 3' end is common to most eukaryotic
mRNAs, both plant and animal. The currently accepted view of polyA addition is that the
nascent transcript extends beyond the mature 3' terminus. Contained within this transcript
are signals for polyadenylation and proper 3' end formation. This processing at the 3' end
involves cleavage of the mRNA and addition of polyA to the mature 3' end. By searching for
consensus sequences near the polyA tract in both plant and animal mRNAs, it has been
possible to identify consensus sequences that apparently are involved in polyA addition and 3'
end cleavage. The same consensus sequences seem to be important to both of these
processes. These signals are typically a variation on the sequence AATAAA. In animal cells,
some variants of this sequence that are functional have been identified; in plant cells there
seems to be an extended range of functional sequences (Wickens and Stephenson, 1984;
Dean et al., 1986). Because all of these consensus sequences are variations on AATAAA,
they all are A+T rich sequences. This sequence is typically found 15 to 20 bp before the
polyA tract in a mature mRNA. Studies in animal cells indicate that this sequence is
involved in both polyA addition and 3' maturation. Site directed mutations in this sequence
can disrupt these functions (Conway and Wickens, 1988; Wickens et al., 1987). However, it
has also been observed that sequences up to 50 to 100 bp 3' to the putative polyA signal are
also required; i.e., a gene that has a normal AATAAA but has been replaced or disrupted
downstream does not get properly polyadenylated (Gil and Proudfoot, 1984; Sadofsky and
Alwine, 1984; McDevitt et al., 1984). That is, the polyA signal itself is not sufficient for
complete and proper processing. It is not yet known what specific downstream sequences are
required in addition to the polyA signal, or if there is a specific sequence that has this
function. Therefore, sequence analysis can only identify potential polyA signals.

In naturally occurring mRNAs that are normally polyadenylated, it has been
observed that disruption of this process, either by altering the polyA signal or other sequences
in the mRNA, can have profound effects on the level of functional mRNA. This has
been observed in several naturally occurring mRNAs, with results that are gene-specific so
far.

It has been shown that in natural mRNAs proper polyadenylation is important in
mRNA accumulation, and that disruption of this process can affect mRNA levels
significantly. However, insufficient knowledge exists to predict the effect of changes in a normal
gene. In a heterologous gene, it is even harder to predict the consequences. However, it is
possible that the putative sites identified are dysfunctional. That is, these sites may not act as
proper polyA sites, but instead function as aberrant sites that give rise to unstable mRNAs.

In animal cell systems, AATAAA is by far the most common signal identified in
mRNAs upstream of the polyA, but at least four variants have also been found (Wickens and
Stephenson, 1984). In plants, not nearly so much analysis has been done, but it is clear that
multiple sequences similar to AATAAA can be used. The plant sites in Table 5 called major
or minor refer only to the study of Dean et al. (1986) which analyzed only three types of
plant gene. The designation of polyadenylation sites as major or minor refers only to the
frequency of their occurrence as functional sites in naturally occurring genes that have been
analyzed. In the case of plants this is a very limited database. It is hard to predict with any
certainty that a site designated major or minor is more or less likely to function partially or
completely when found in a heterologous gene such as those encoding the crystal proteins of
the present invention.

Table 5

Polyadenylation Sites in Plant Genes

PA      AATAAA    Major consensus site
P1A     AATAAT    Major plant site
P2A     AACCAA    Minor plant site
P3A     ATATAA    "
P4A     AATCAA    "
P5A     ATACTA    "
P6A     ATAAAA    "
P7A     ATGAAA    "
P8A     AAGCAT    "
P9A     ATTAAT    "
P10A    ATACAT    "
P11A    AAAATA    "
P12A    ATTAAA    Minor animal site
P13A    AATTAA    "
P14A    AATACA    "
P15A    CATAAA    "
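
The hexamers of Table 5 lend themselves to a simple scan of a candidate coding sequence. The following is a minimal sketch, not a method taken from the specification: it reports exact matches on the given strand only, and the names `PUTATIVE_POLYA_SIGNALS` and `find_putative_polya` are hypothetical.

```python
# Hexamers transcribed from Table 5, keyed by the table's own labels.
PUTATIVE_POLYA_SIGNALS = {
    "PA": "AATAAA", "P1A": "AATAAT", "P2A": "AACCAA", "P3A": "ATATAA",
    "P4A": "AATCAA", "P5A": "ATACTA", "P6A": "ATAAAA", "P7A": "ATGAAA",
    "P8A": "AAGCAT", "P9A": "ATTAAT", "P10A": "ATACAT", "P11A": "AAAATA",
    "P12A": "ATTAAA", "P13A": "AATTAA", "P14A": "AATACA", "P15A": "CATAAA",
}


def find_putative_polya(dna):
    """Return (position, label, hexamer) for each Table 5 hexamer in the sequence.

    Positions are 0-based and refer to the strand as given; overlapping hits
    are all reported.
    """
    seq = dna.upper()
    by_hexamer = {hexamer: label for label, hexamer in PUTATIVE_POLYA_SIGNALS.items()}
    hits = []
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        label = by_hexamer.get(window)
        if label:
            hits.append((i, label, window))
    return hits


hits = find_putative_polya("GGAATAAATT")
```

Consistent with the surrounding text, a hit marks only a potential polyadenylation signal. Whether it functions in a plant cell also depends on downstream sequences that such a scan does not evaluate.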



The present invention provides a method for preparing synthetic plant genes
which genes express their protein product at levels significantly higher than the wild-type
genes which were commonly employed in plant transformation heretofore. In another aspect,
the present invention also provides novel synthetic plant genes which encode non-plant
proteins.

As described above, the expression of native B. thuringiensis genes in plants is
often problematic. The nature of the coding sequences of B. thuringiensis genes
distinguishes them from plant genes as well as many other heterologous genes expressed in plants.
In particular, B. thuringiensis genes are very rich (~62%) in adenine (A) and thymine (T)
while plant genes and most other bacterial genes which have been expressed in plants are on
the order of 45-55% A+T.

Due to the degeneracy of the genetic code and the limited number of codon
choices for any amino acid, most of the "excess" A+T of the structural coding sequences of
some Bacillus species is found in the third position of the codons. That is, genes of some
Bacillus species have A or T as the third nucleotide in many codons. Thus A+T content in
part can determine codon usage bias. In addition, it is clear that genes evolve for maximum
function in the organism in which they evolve. This means that particular nucleotide
sequences found in a gene from one organism, where they may play no role except to code for a
particular stretch of amino acids, have the potential to be recognized as gene control elements
in another organism (such as transcriptional promoters or terminators, polyA addition sites,
intron splice sites, or specific mRNA degradation signals). It is perhaps surprising that such
misread signals are not a more common feature of heterologous gene expression, but this can
be explained in part by the relatively homogeneous A+T content (~50%) of many organisms.
This A+T content plus the nature of the genetic code put clear constraints on the likelihood of
occurrence of any particular oligonucleotide sequence. Thus, a gene from E. coli with a 50%
A+T content is much less likely to contain any particular A+T rich segment than a gene from
B. thuringiensis.

Typically, to obtain high-level expression of the δ-endotoxin genes in plants, an existing
structural coding sequence ("structural gene") which codes for the δ-endotoxin is modified
by removal of ATTTA sequences and putative polyadenylation signals by site-directed
mutagenesis of the DNA comprising the structural gene. It is most preferred that
substantially all the polyadenylation signals and ATTTA sequences are removed, although
enhanced expression levels are observed with only partial removal of either of the above
identified sequences. Alternatively, if a synthetic gene is prepared which codes for the
expression of the subject protein, codons are selected to avoid the ATTTA sequence and
putative polyadenylation signals. For purposes of the present invention putative
polyadenylation signals include, but are not necessarily limited to, AATAAA, AATAAT,
AACCAA, ATATAA, AATCAA, ATACTA, ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA,
AATTAA, AATACA and CATAAA. In replacing the ATTTA sequences and polyadenylation signals,
codons are preferably utilized which avoid the codons which are rarely found in plant
genomes.
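
Locating the ATTTA sequences and the sixteen putative polyadenylation signals listed above is a simple substring search. The sketch below is illustrative only: the helper names `find_motifs` and `find_problem_sequences` are hypothetical, the search covers only the strand given, and the input is a toy sequence.

```python
# The sixteen putative polyadenylation signals listed in the text.
POLYA_SIGNALS = (
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA", "ATAAAA", "ATGAAA",
    "AAGCAT", "ATTAAT", "ATACAT", "AAAATA", "ATTAAA", "AATTAA", "AATACA", "CATAAA",
)


def find_motifs(seq, motifs):
    """Return sorted (position, motif) pairs for every occurrence, overlaps included."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)  # step by one so overlaps are found
    return sorted(hits)


def find_problem_sequences(seq):
    """Report putative polyadenylation signals and ATTTA sequences in one strand."""
    return {
        "polyA": find_motifs(seq, POLYA_SIGNALS),
        "ATTTA": find_motifs(seq, ("ATTTA",)),
    }


print(find_problem_sequences("GGCAATAAAGGCATTTAGG"))
```

Positions are zero-based offsets into the sequence supplied. Each reported hit is a candidate for removal by a silent codon change.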
In the first step, the selected DNA sequence is scanned to identify regions with greater
than four consecutive adenine (A) or thymine (T) nucleotides. The A+T regions are scanned
for potential plant polyadenylation signals. Although the absence of five or more
consecutive A or T nucleotides eliminates most plant polyadenylation signals, if more than
one of the minor polyadenylation signals is identified within ten nucleotides of each
other, then the nucleotide sequence of this region is preferably altered to remove these
signals while maintaining the original encoded amino acid sequence.

The second step is to consider the approximately 15 to 30 nucleotide residues surrounding
the A+T rich region identified in step one. If the A+T content of the surrounding region is
less than 80%, the region should be examined for polyadenylation signals. Alteration of the
region based on polyadenylation signals is dependent upon (1) the number of
polyadenylation signals present and (2) the presence of a major plant polyadenylation signal.
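
The two scanning steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names are hypothetical, a run of "greater than four" A or T nucleotides is taken to mean five or more, and the flank of 15 nucleotides on each side is one arbitrary choice within the "about 15 to about 30" range.

```python
import re


def at_rich_runs(seq, min_run=5):
    """Step one: (start, end) spans of min_run or more consecutive A/T bases."""
    pattern = r"[AT]{%d,}" % min_run
    return [m.span() for m in re.finditer(pattern, seq.upper())]


def surrounding_window(seq, span, flank=15):
    """Step two: extend a run by `flank` bases on each side, clipped to the sequence."""
    start, end = span
    return max(0, start - flank), min(len(seq), end + flank)


def at_content(seq):
    """Fraction of A and T in a (non-empty) stretch of sequence."""
    seq = seq.upper()
    return sum(b in "AT" for b in seq) / len(seq)


seq = "GCGCATTTAAGCGC"  # toy sequence with one six-base A/T run
for run in at_rich_runs(seq):
    lo, hi = surrounding_window(seq, run, flank=2)
    print(run, (lo, hi), at_content(seq[lo:hi]))
```

Each window returned in this way would then be examined for the number of polyadenylation signals present and for the presence of a major plant polyadenylation signal, as set out above.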

The extended region is examined for the presence of plant polyadenylation signals. The
polyadenylation signals are removed by site-directed mutagenesis of the DNA sequence. The
extended region is also examined for multiple copies of the ATTTA sequence, which are also
removed by mutagenesis.

It is also preferred that regions comprising many consecutive A+T bases or G+C bases are
disrupted, since these regions are predicted to have a higher likelihood of forming hairpin
structures due to self-complementarity. Therefore, insertion of heterogeneous base pairs
would reduce the likelihood of formation of self-complementary secondary structures, which
are known to inhibit transcription and/or translation in some organisms. In most cases, the
adverse effects may be minimized by using sequences which do not contain more than five
consecutive A+T or G+C.

4.11.5 Synthetic Oligonucleotides for Mutagenesis

When oligonucleotides are used in the mutagenesis, it is desirable to maintain the proper
amino acid sequence and reading frame, without introducing common restriction sites such as
BglII, HindIII, SacI, KpnI, EcoRI, NcoI, PstI and SalI into the modified gene. These
restriction sites are found in poly-linker insertion sites of many cloning vectors. Of
course, the introduction of new polyadenylation signals, ATTTA sequences or consecutive
stretches of more than five A+T or G+C should also be avoided. The preferred size for the
oligonucleotides is about 40 to about 50 bases, but fragments ranging from about 18 to
about 100 bases have been utilized. In most cases, a minimum of about 5 to about 8 base
pairs of homology to the template DNA on both ends of the synthesized fragment is
maintained to ensure proper hybridization of the primer to the template. The
oligonucleotides should avoid sequences longer than five base pairs A+T or G+C. Codons used
in the replacement of wild-type codons should preferably avoid the TA or CG doublet
wherever possible. Codons are selected from a plant preferred codon table (such as Table 6
below) so as to avoid codons which are rarely found in plant genomes, and efforts should be
made to select codons to preferably adjust the G+C content to about 50%.
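
Whether a proposed change introduces one of the restriction sites named above can be checked by comparing the sequence before and after modification. The sketch below is illustrative: the function names are hypothetical, and the recognition sequences in the dictionary are the standard ones for these enzymes and are not recited in the text. All eight sites are palindromic, so searching one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in the text
# (supplied here as background knowledge, not taken from the specification).
RESTRICTION_SITES = {
    "BglII": "AGATCT",
    "HindIII": "AAGCTT",
    "SacI": "GAGCTC",
    "KpnI": "GGTACC",
    "EcoRI": "GAATTC",
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
}


def sites_present(seq):
    """Names of the listed enzymes whose recognition sequence occurs in seq."""
    seq = seq.upper()
    return sorted(name for name, site in RESTRICTION_SITES.items() if site in seq)


def newly_introduced_sites(original, modified):
    """Sites present in the modified sequence but absent from the original."""
    return sorted(set(sites_present(modified)) - set(sites_present(original)))


# Toy example: a single A-to-C change creates an EcoRI site.
print(newly_introduced_sites("GGGAATTAGGG", "GGGAATTCGGG"))
```

The comparison is best run on the full modified gene rather than on the oligonucleotide alone, since a site can span the junction between changed and unchanged sequence.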
Table 6

Preferred Codon Usage In Plants

Amino Acid    Codon    Percent Usage in Plants

ARG           CGA        7
              CGC       11
              CGG        5
              CGU       25
              AGA       29
              AGG       23
LEU           CUA        8
              CUC       20
              CUG       10
              CUU       28
              UUA        5
              UUG       30
SER           UCA       14
              UCC       26
              UCG        3
              UCU       21
              AGC       21
              AGU       15
THR           ACA       21
              ACC       41
              ACG        7
              ACU       31
PRO           CCA       45
              CCC       19
              CCG        9
              CCU       26
ALA           GCA       23
              GCC       32
              GCG        3
              GCU       41
GLY           GGA       32
              GGC       20
              GGG       11
              GGU       37
ILE           AUA       12
              AUC       45
              AUU       43
VAL           GUA        9
              GUC       20
              GUG       28
              GUU       43
LYS           AAA       36
              AAG       64
ASN           AAC       72
              AAU       28
GLN           CAA       64
              CAG       36
HIS           CAC       65
              CAU       35
GLU           GAA       48
              GAG       52
ASP           GAC       48
              GAU       52
TYR           UAC       68
              UAU       32
CYS           UGC       78
              UGU       22
PHE           UUC       56
              UUU       44
MET           AUG      100
TRP           UGG      100



Regions with many consecutive A+T bases or G+C bases are predicted to have a higher
likelihood to form hairpin structures due to self-complementarity. Disruption of these
regions by the insertion of heterogeneous base pairs is preferred and should reduce the
likelihood of the formation of self-complementary secondary structures such as hairpins
which are known in some organisms to inhibit transcription (transcriptional terminators)
and translation (attenuators).

Alternatively, a completely synthetic gene for a given amino acid sequence can be prepared,
with regions of five or more consecutive A+T or G+C nucleotides being avoided. Codons are
selected avoiding the TA and CG doublets in codons whenever possible. Codon usage can be
normalized against a plant preferred codon usage table (such as Table 6) and the G+C
content preferably adjusted to about 50%. The resulting sequence should be examined to
ensure that there are minimal putative plant polyadenylation signals and ATTTA sequences.
Restriction sites found in commonly used cloning vectors are also preferably avoided.
However, placement of several unique restriction sites throughout the gene is useful for
analysis of gene expression or construction of gene variants.
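
A first-pass synthetic coding sequence of the kind described above can be drafted by back-translating the amino acid sequence with a plant preferred codon table. The sketch below makes a deliberate simplification that is an assumption of this illustration and not the method of the specification: it always picks the single most frequently used codon for each amino acid in Table 6 (written as DNA, with T in place of U). It does not avoid TA or CG doublets across codon boundaries, runs of A+T or G+C, polyadenylation signals, ATTTA sequences or restriction sites, so its output would still need the checks described above. The function names are hypothetical.

```python
# Most frequently used plant codon per amino acid, taken from Table 6
# (one-letter amino acid codes; codons written as DNA).
TOP_PLANT_CODON = {
    "R": "AGA", "L": "TTG", "S": "TCC", "T": "ACC", "P": "CCA",
    "A": "GCT", "G": "GGT", "I": "ATC", "V": "GTT", "K": "AAG",
    "N": "AAC", "Q": "CAA", "H": "CAC", "E": "GAG", "D": "GAT",
    "Y": "TAC", "C": "TGC", "F": "TTC", "M": "ATG", "W": "TGG",
}


def back_translate(protein):
    """Draft a coding sequence using the top Table 6 codon for each residue."""
    return "".join(TOP_PLANT_CODON[aa] for aa in protein.upper())


def gc_fraction(dna):
    """Fraction of G and C in a (non-empty) DNA sequence."""
    dna = dna.upper()
    return sum(b in "GC" for b in dna) / len(dna)


draft = back_translate("MKW")  # toy three-residue peptide
print(draft, gc_fraction(draft))
```

In practice a designer would mix codons in proportion to their usage rather than repeat the top codon, and would iterate until the G+C content is close to the preferred value of about 50%.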
4.11.6 "Plantized" Gene Constructs

The expression of a plant gene which exists in double-stranded DNA form involves
transcription of messenger RNA (mRNA) from one strand of the DNA by RNA polymerase enzyme,
and the subsequent processing of the mRNA primary transcript inside the nucleus. This
processing involves a 3' non-translated region which causes the addition of polyadenylate
nucleotides to the 3' end of the RNA. Transcription of DNA into mRNA is regulated by a
region of DNA usually referred to as the "promoter." The promoter region contains a
sequence of bases that signals RNA polymerase to associate with the DNA and to initiate the
transcription of mRNA using one of the DNA strands as a template to make a corresponding
strand of RNA.

A number of promoters which are active in plant cells have been described in the
literature. These include the nopaline synthase (NOS) and octopine synthase (OCS) promoters
(which are carried on tumor-inducing plasmids of Agrobacterium tumefaciens), the
Cauliflower Mosaic Virus (CaMV) 19S and 35S promoters, the light-inducible promoter from
the small subunit of ribulose bis-phosphate carboxylase (ssRUBISCO, a very abundant plant
polypeptide) and the mannopine synthase (MAS) promoter (Velten et al., 1984 and Velten and
Schell, 1985). All of these promoters have been used to create various types of DNA
constructs which have been expressed in plants (see e.g., Int. Pat. Appl. Publ. No. WO
84/02913).

Promoters which are known or are found to cause transcription of RNA in plant cells can be
used in the present invention. Such promoters may be obtained from plants or plant viruses
and include, but are not limited to, the CaMV35S promoter and promoters isolated from plant
genes such as ssRUBISCO genes. As described below, it is preferred that the particular
promoter selected should be capable of causing sufficient expression to result in the
production of an effective amount of protein.

The promoters used in the DNA constructs (i.e., chimeric plant genes) of the present
invention may be modified, if desired, to affect their control characteristics. For
example, the CaMV35S promoter may be ligated to the portion of the ssRUBISCO gene that
represses the expression of ssRUBISCO in the absence of light, to create a promoter which
is active in leaves but not in roots. The resulting chimeric promoter may be used as
described herein. For purposes of this description, the phrase "CaMV35S" promoter thus
includes variations of CaMV35S promoter, e.g., promoters derived by means of ligation with
operator regions, random or controlled mutagenesis, etc. Furthermore, the promoters may be
altered to contain multiple "enhancer sequences" to assist in elevating gene expression.

The RNA produced by a DNA construct of the present invention also contains a 5'
non-translated leader sequence. This sequence can be derived from the promoter selected to
express the gene, and can be specifically modified so as to increase translation of the
mRNA. The 5' non-translated regions can also be obtained from viral RNAs, from suitable
eukaryotic genes, or from a synthetic gene sequence. The present invention is not limited
to the constructs presented in the following examples. Rather, the non-translated leader
sequence can be part of the 5' end of the non-translated region of the coding sequence for
the virus coat protein, or part of the promoter sequence, or can be derived from an
unrelated promoter or coding sequence. In any case, it is preferred that the sequence
flanking the initiation site conform to the translational consensus sequence rules for
enhanced translation initiation reported by Kozak (1984).

The cry DNA constructs of the present invention may also contain one or more modified or
fully-synthetic structural coding sequences which have been changed to enhance the
performance of the cry gene in plants. The structural genes of the present invention may
optionally encode a fusion protein comprising an amino-terminal chloroplast transit peptide
or secretory signal sequence.

The DNA construct also contains a 3' non-translated region. The 3' non-translated region
contains a polyadenylation signal which functions in plants to cause the addition of
polyadenylate nucleotides to the 3' end of the viral RNA. Examples of suitable 3' regions
are (1) the 3' transcribed, non-translated regions containing the polyadenylation signal of
Agrobacterium tumor-inducing (Ti) plasmid genes, such as the nopaline synthase (NOS) gene,
and (2) plant genes like the soybean storage protein (7S) genes and the small subunit of
the RuBP carboxylase (E9) gene.
4.12 Methods for Producing Insect-Resistant Transgenic Plants

By transforming a suitable host cell, such as a plant cell, with a recombinant cry*
gene-containing segment, the expression of the encoded crystal protein (i.e., a bacterial
crystal protein or polypeptide having insecticidal activity against coleopterans) can
result in the formation of insect-resistant plants.

By way of example, one may utilize an expression vector containing a coding region for a
B. thuringiensis crystal protein and an appropriate selectable marker to transform a
suspension of embryonic plant cells, such as wheat or corn cells, using a method such as
particle bombardment (Maddock et al., 1991; Vasil et al., 1992) to deliver the DNA coated
on microprojectiles into the recipient cells. Transgenic plants are then regenerated from
transformed embryonic calli that express the insecticidal proteins.

The formation of transgenic plants may also be accomplished using other methods of cell
transformation which are known in the art such as Agrobacterium-mediated DNA transfer
(Fraley et al., 1983). Alternatively, DNA can be introduced into plants by direct DNA
transfer into pollen (Zhou et al., 1983; Hess, 1987; Luo et al., 1988), by injection of the
DNA into reproductive organs of a plant (Pena et al., 1987), or by direct injection of DNA
into the cells of immature embryos followed by the rehydration of desiccated embryos
(Neuhaus et al., 1987; Benbrook et al., 1986).

The regeneration, development, and cultivation of plants from single plant protoplast
transformants or from various transformed explants is well known in the art (Weissbach and
Weissbach, 1988). This regeneration and growth process typically includes the steps of
selecting transformed cells and culturing those individualized cells through the usual
stages of embryonic development through the rooted plantlet stage. Transgenic embryos and
seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter
planted in an appropriate plant growth medium such as soil.

The development or regeneration of plants containing the foreign, exogenous gene that
encodes a polypeptide of interest introduced by Agrobacterium from leaf explants can be
achieved by methods well known in the art, such as those described by Horsch et al. (1985).
In this procedure, transformants are cultured in the presence of a selection agent and in a
medium that induces the regeneration of shoots in the plant strain being transformed, as
described by Fraley et al. (1983).

This procedure typically produces shoots within two to four months and those shoots are
then transferred to an appropriate root-inducing medium containing the selective agent and
an antibiotic to prevent bacterial growth. Shoots that rooted in the presence of the
selective agent to form plantlets are then transplanted to soil or other media to allow the
production of roots. These procedures vary depending upon the particular plant strain
employed, such variations being well known in the art.

Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic
plants, as discussed before. Otherwise, pollen obtained from the regenerated plants is
crossed to seed-grown plants of agronomically important, preferably inbred lines.
Conversely, pollen from plants of those important lines is used to pollinate regenerated
plants. A transgenic plant of the present invention containing a desired polypeptide is
cultivated using methods well known to one skilled in the art.

Such plants can form germ cells and transmit the transformed trait(s) to progeny plants.
Likewise, transgenic plants can be grown in the normal manner and crossed with plants that
have the same transformed hereditary factors or other hereditary factors. The resulting
hybrid individuals have the corresponding phenotypic properties. A transgenic plant of this
invention thus has an increased amount of a coding region (e.g., a mutated cry gene) that
encodes the mutated Cry polypeptide of interest. A preferred transgenic plant is an
independent segregant and can transmit that gene and its activity to its progeny. A more
preferred transgenic plant is homozygous for that gene, and transmits that gene to all of
its offspring on sexual mating.

Seed from a transgenic plant may be grown in the field or greenhouse, and resulting
sexually mature transgenic plants are self-pollinated to generate true breeding plants. The
progeny from these plants become true breeding lines that are evaluated for, by way of
example, increased insecticidal capacity against coleopteran insects, preferably in the
field, under a range of environmental conditions. The inventors contemplate that the
present invention will find particular utility in the creation of transgenic plants of
commercial interest including various grasses, grains, fibers, tubers, legumes, ornamental
plants, cacti, succulents, fruits, berries, and vegetables, as well as a number of nut- and
fruit-bearing trees and plants.

4.13 Methods for Producing Combinatorial Cry3* Variants

Crystal protein mutants containing substitutions in one or more domains may be constructed
via a number of techniques. For instance, sequences of highly related genes can be readily
shuffled using the PCR™-based technique described by Stemmer (1994). Alternatively, if
suitable restriction sites are available, the mutations of one cry gene may be combined
with the mutations of a second cry gene by routine subcloning methodologies. If a suitable
restriction site is not available, one may be generated by oligonucleotide-directed
mutagenesis using any number of procedures known to those skilled in the art.
Alternatively, splice-overlap extension PCR™ (Horton et al., 1989) may be used to combine
mutations in different regions of a crystal protein. In this procedure, overlapping DNA
fragments generated by the PCR™ and containing different mutations within their unique
sequences may be annealed and used as a template for amplification using flanking primers
to generate a hybrid gene sequence. Finally, cry* mutants may be combined by simply using
one cry mutant as a template for oligonucleotide-directed mutagenesis using any number of
protocols such as those described herein.

4.14 Isolating Homologous Gene and Gene Fragments

The genes and δ-endotoxins according to the subject invention include not only the full
length sequences disclosed herein but also fragments of these sequences, or fusion
proteins, which retain the characteristic insecticidal activity of the sequences
specifically exemplified herein.

It should be apparent to a person skilled in this art that insecticidal δ-endotoxins can be
identified and obtained through several means. The specific genes, or portions thereof, may
be obtained from a culture depository, or constructed synthetically, for example, by use of
a gene machine. Variations of these genes may be readily constructed using standard
techniques for making point mutations. Also, fragments of these genes can be made using
commercially available exonucleases or endonucleases according to standard procedures. For
example, enzymes such as Bal31 or site-directed mutagenesis can be used to systematically
cut off nucleotides from the ends of these genes. Also, genes which code for active
fragments may be obtained using a variety of other restriction enzymes. Proteases may be
used to directly obtain active fragments of these δ-endotoxins.

Equivalent δ-endotoxins and/or genes encoding these equivalent δ-endotoxins can also be
isolated from Bacillus strains and/or DNA libraries using the teachings provided herein.
For example, antibodies to the δ-endotoxins disclosed and claimed herein can be used to
identify and isolate other δ-endotoxins from a mixture of proteins. Specifically,
antibodies may be raised to the portions of the δ-endotoxins which are most constant and
most distinct from other B. thuringiensis δ-endotoxins. These antibodies can then be used
to specifically identify equivalent δ-endotoxins with the characteristic insecticidal
activity by immunoprecipitation, enzyme linked immunoassay (ELISA), or Western blotting.

A further method for identifying the δ-endotoxins and genes of the subject invention is
through the use of oligonucleotide probes. These probes are nucleotide sequences having a
detectable label. As is well known in the art, if the probe molecule and nucleic acid
sample hybridize by forming a strong bond between the two molecules, it can be reasonably
assumed that the probe and sample are essentially identical. The probe's detectable label
provides a means for determining in a known manner whether hybridization has occurred. Such
a probe analysis provides a rapid method for identifying formicidal δ-endotoxin genes of
the subject invention.

The nucleotide segments which are used as probes according to the invention can be
synthesized by use of DNA synthesizers using standard procedures. In the use of the
nucleotide segments as probes, the particular probe is labeled with any suitable label
known to those skilled in the art, including radioactive and non-radioactive labels.
Typical radioactive labels include 32P, 125I, 35S, or the like. A probe labeled with a
radioactive isotope can be constructed from a nucleotide sequence complementary to the DNA
sample by a conventional nick translation reaction, using a DNase and DNA polymerase. The
probe and sample can then be combined in a hybridization buffer solution and held at an
appropriate temperature until annealing occurs. Thereafter, the membrane is washed free of
extraneous materials, leaving the sample and bound probe molecules, which are typically
detected and quantified by autoradiography and/or liquid scintillation counting.

Non-radioactive labels include, for example, ligands such as biotin or thyroxine, as well
as enzymes such as hydrolases or peroxidases, or the various chemiluminescers such as
luciferin, or fluorescent compounds like fluorescein and its derivatives. The probe may
also be labeled at both ends with different types of labels for ease of separation, as, for
example, by using an isotopic label at the end mentioned above and a biotin label at the
other end.

Duplex formation and stability depend on substantial complementarity between the two
strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated.
Therefore, the probes of the subject invention include mutations (both single and
multiple), deletions, insertions of the described sequences, and combinations thereof,
wherein said mutations, insertions and deletions permit formation of stable hybrids with
the target polynucleotide of interest. Mutations, insertions, and deletions can be produced
in a given polynucleotide sequence in many ways, by methods currently known to an
ordinarily skilled artisan, and perhaps by other methods which may become known in the
future.

The potential variations in the probes listed are due, in part, to the redundancy of the
genetic code. Because of this redundancy, i.e., because more than one coding nucleotide
triplet (codon) can be used for most of the amino acids used to make proteins, different
nucleotide sequences can code for a particular amino acid. Thus, the amino acid sequences
of the B. thuringiensis δ-endotoxins and peptides can be prepared by equivalent nucleotide
sequences encoding the same amino acid sequence of the protein or peptide. Accordingly, the
subject invention includes such equivalent nucleotide sequences. Also, inverse or
complement sequences are an aspect of the subject invention and can be readily used by a
person skilled in this art. In addition it has been shown that proteins of identified
structure and function may be constructed by changing the amino acid sequence if such
changes do not alter the protein secondary structure (Kaiser and Kezdy, 1984). Thus, the
subject invention includes mutants of the amino acid sequence depicted herein which do not
alter the protein secondary structure, or if the structure is altered, the biological
activity is substantially retained. Further, the invention also includes mutants of
organisms hosting all or part of a δ-endotoxin-encoding gene of the invention. Such mutants
can be made by techniques well known to persons skilled in the art. For example, UV
irradiation can be used to prepare mutants of host organisms. Likewise, such mutants may
include asporogenous host cells which also can be prepared by procedures well known in the
art.
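
The redundancy of the genetic code discussed above can be made concrete by counting how many different nucleotide sequences encode the same peptide. The sketch below is illustrative only: the function name is hypothetical, the per-residue codon counts are those of the standard genetic code (stop codons excluded), and the peptides used are toy examples.

```python
# Number of codons for each amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encoding_sequences(peptide):
    """Number of distinct nucleotide sequences that encode the given peptide."""
    total = 1
    for aa in peptide.upper():
        total *= CODONS_PER_AMINO_ACID[aa]  # choices multiply residue by residue
    return total


print(count_encoding_sequences("MW"))  # only one codon each for Met and Trp
print(count_encoding_sequences("LS"))  # six codons each for Leu and Ser
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why a very large number of equivalent nucleotide sequences can encode a full-length protein.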

4.15 Ribozymes

Ribozymes are enzymatic RNA molecules which cleave particular mRNA species. In certain
embodiments, the inventors contemplate the selection and utilization of ribozymes capable
of cleaving the RNA segments of the present invention, and their use to reduce activity of
target mRNAs in particular cell types or tissues.

Six basic varieties of naturally-occurring enzymatic RNAs are known presently. Each can
catalyze the hydrolysis of RNA phosphodiester bonds in trans (and thus can cleave other RNA
molecules) under physiological conditions. In general, enzymatic nucleic acids act by first
binding to a target RNA. Such binding occurs through the target binding portion of an
enzymatic nucleic acid which is held in close proximity to an enzymatic portion of the
molecule that acts to cleave the target RNA. Thus, the enzymatic nucleic acid first
recognizes and then binds a target RNA through complementary base-pairing, and once bound
to the correct site, acts enzymatically to cut the target RNA. Strategic cleavage of such a
target RNA will destroy its ability to direct synthesis of an encoded protein. After an
enzymatic nucleic acid has bound and cleaved its RNA target, it is released from that RNA
to search for another target and can repeatedly bind and cleave new targets.

The enzymatic nature of a ribozyme is advantageous over many technologies, such as
antisense technology (where a nucleic acid molecule simply binds to a nucleic acid target
to block its translation), since the concentration of ribozyme necessary to effect a
therapeutic treatment is lower than that of an antisense oligonucleotide. This advantage
reflects the ability of the ribozyme to act enzymatically. Thus, a single ribozyme molecule
is able to cleave many molecules of target RNA. In addition, the ribozyme is a highly
specific inhibitor, with the specificity of inhibition depending not only on the base
pairing mechanism of binding to the target RNA, but also on the mechanism of target RNA
cleavage. Single mismatches, or base-substitutions, near the site of cleavage can
completely eliminate catalytic activity of a ribozyme. Similar mismatches in antisense
molecules do not prevent their action (Woolf et al., 1992). Thus, the specificity of action
of a ribozyme is greater than that of an antisense oligonucleotide binding the same RNA
site.

The enzymatic nucleic acid molecule may be formed in a hammerhead, hairpin, hepatitis δ
virus, group I intron or RNaseP RNA (in association with an RNA guide sequence) or
Neurospora VS RNA motif. Examples of hammerhead motifs are described by Rossi et al.
(1992); examples of hairpin motifs are described by Hampel et al. (Eur. Pat. EP 0360257),
Hampel and Tritz (1989), Hampel et al. (1990) and Cech et al. (U.S. Patent 5,631,359); an
example of the hepatitis δ virus motif is described by Perrotta and Been (1992); an example
of the RNaseP motif is described by Guerrier-Takada et al. (1983); the Neurospora VS RNA
ribozyme motif is described by Collins (Saville and Collins, 1990; Saville and Collins,
1991; Collins and Olive, 1993); and an example of the Group I intron is described by Cech
et al. (U.S. Patent 4,987,071). All that is important in an enzymatic nucleic acid molecule
of this invention is that it has a specific substrate binding site which is complementary
to one or more of the target gene RNA regions, and that it has nucleotide sequences within
or surrounding that substrate binding site which impart an RNA cleaving activity to the
molecule. Thus the ribozyme constructs need not be limited to specific motifs mentioned
herein.

The invention provides a method for producing a class of enzymatic cleaving agents which
exhibit a high degree of specificity for the RNA of a desired target. The enzymatic nucleic
acid molecule is preferably targeted to a highly conserved sequence region of a target mRNA
such that specific treatment of a disease or condition can be provided with either one or
several enzymatic nucleic acids. Such enzymatic nucleic acid molecules can be delivered
exogenously to specific cells as required. Alternatively, the ribozymes can be expressed
from DNA or RNA vectors that are delivered to specific cells.

Small enzymatic nucleic acid motifs (e.g., of the hammerhead or the hairpin structure) may
be used for exogenous delivery. The simple structure of these molecules increases the
ability of the enzymatic nucleic acid to invade targeted regions of the mRNA structure.
Alternatively, catalytic RNA molecules can be expressed within cells from eukaryotic
promoters (e.g., Scanlon et al., 1991; Kashani-Sabet et al., 1992; Dropulic et al., 1992;
Weerasinghe et al., 1991; Ojwang et al., 1992; Chen et al., 1992; Sarver et al., 1990).
Those skilled in the art realize that any ribozyme can be expressed in eukaryotic cells
from the appropriate DNA vector. The activity of such ribozymes can be augmented by their
release from the primary transcript by a second ribozyme (Draper et al., Int. Pat. Appl.
Publ. No. WO 93/23569, and Sullivan et al., Int. Pat. Appl. Publ. No. WO 94/02595, both
hereby incorporated in their totality by reference herein; Ohkawa et al., 1992; Taira et
al., 1991; Ventura et al., 1993).

Ribozymes may be added directly, or can be complexed with cationic lipids, lipid complexes,
packaged within liposomes, or otherwise delivered to target cells. The RNA or RNA complexes
can be locally administered to relevant tissues ex vivo, or in vivo through injection,
aerosol inhalation, infusion pump or stent, with or without their incorporation in
biopolymers.

Ribozymes may be designed as described in Draper et al. (Int. Pat. Appl. Publ. No. WO
93/23569), or Sullivan et al. (Int. Pat. Appl. Publ. No. WO 94/02595) and synthesized to be
tested in vitro and in vivo, as described. Such ribozymes can also be optimized for
delivery. While specific examples are provided, those in the art will recognize that
equivalent RNA targets in other species can be utilized when necessary.

Hammerhead or hairpin ribozymes may be individually analyzed by computer
folding (Jaeger et al., 1989) to assess whether the ribozyme sequences fold into the
appropriate secondary structure. Those ribozymes with unfavorable intramolecular interactions
between the binding arms and the catalytic core are eliminated from consideration. Varying
binding arm lengths can be chosen to optimize activity. Generally, at least 5 bases on each
arm are able to bind to, or otherwise interact with, the target RNA.

Ribozymes of the hammerhead or hairpin motif may be designed to anneal to
various sites in the mRNA message, and can be chemically synthesized. The method of
synthesis used follows the procedure for normal RNA synthesis as described in Usman et al.
(1987) and in Scaringe et al. (1990) and makes use of common nucleic acid protecting and
coupling groups, such as dimethoxytrityl at the 5'-end, and phosphoramidites at the 3'-end.
Average stepwise coupling yields are typically >98%. Hairpin ribozymes may be
synthesized in two parts and annealed to reconstruct an active ribozyme (Chowrira and Burke,
1992). Ribozymes may be modified extensively to enhance stability by modification with
nuclease-resistant groups, for example, 2'-amino, 2'-C-allyl, 2'-fluoro, 2'-O-methyl, 2'-H (for
a review see Usman and Cedergren, 1992). Ribozymes may be purified by gel
electrophoresis using general methods or by high pressure liquid chromatography and resuspended in
water.
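
The practical meaning of a stepwise coupling yield above 98% can be illustrated with a short calculation. The sketch below is only an illustration, not part of the disclosed method. It assumes every coupling step proceeds independently with the same yield, and the 36-nucleotide length in the usage note is a hypothetical example rather than a value taken from this specification.

```python
def full_length_fraction(n_nucleotides: int, stepwise_yield: float) -> float:
    """Estimate the fraction of chains that reach full length.

    Assumption (not from the specification): each coupling is independent and
    has the same yield. A chain of n nucleotides needs n - 1 couplings because
    the first nucleotide is already attached to the solid support.
    """
    if n_nucleotides < 1:
        raise ValueError("n_nucleotides must be at least 1")
    if not 0.0 <= stepwise_yield <= 1.0:
        raise ValueError("stepwise_yield must be between 0 and 1")
    return stepwise_yield ** (n_nucleotides - 1)


if __name__ == "__main__":
    # Hypothetical 36-nucleotide ribozyme at the 98% stepwise yield cited above.
    fraction = full_length_fraction(36, 0.98)
    print(f"Estimated full-length fraction: {fraction:.3f}")
```

Under these assumptions, roughly half of the chains of a hypothetical 36-mer would be full length at a 98% stepwise yield. This is one reason why the purification step described above, and the two-part synthesis of hairpin ribozymes, can be useful.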

Ribozyme activity can be optimized by altering the length of the ribozyme binding
arms, or chemically synthesizing ribozymes with modifications that prevent their
degradation by serum ribonucleases (see e.g., Int. Pat. Appl. Publ. No. WO 92/07065; Perrault et
al., 1990; Pieken et al., 1991; Usman and Cedergren, 1992; Int. Pat. Appl. Publ. No. WO
93/15187; Int. Pat. Appl. Publ. No. WO 91/03162; Eur. Pat. Appl. Publ. No. 92110298.4;
U.S. Patent 5,334,711; and Int. Pat. Appl. Publ. No. WO 94/13688, which describe various
chemical modifications that can be made to the sugar moieties of enzymatic RNA
molecules), modifications which enhance their efficacy in cells, and removal of stem II bases to
shorten RNA synthesis times and reduce chemical requirements.

Sullivan et al. (Int. Pat. Appl. Publ. No. WO 94/02595) describe the general
methods for delivery of enzymatic RNA molecules. Ribozymes may be administered to cells
by a variety of methods known to those familiar with the art, including, but not restricted to,
encapsulation in liposomes, iontophoresis, or incorporation into other vehicles, such as
hydrogels, cyclodextrins, biodegradable nanocapsules, and bioadhesive microspheres. For
some indications, ribozymes may be directly delivered ex vivo to cells or tissues with or
without the aforementioned vehicles. Alternatively, the RNA/vehicle combination may be
locally delivered by direct inhalation, by direct injection or by use of a catheter, infusion
pump or stent. Other routes of delivery include, but are not limited to, intravascular,
intramuscular, subcutaneous or joint injection, aerosol inhalation, oral (tablet or pill form),
topical, systemic, ocular, intraperitoneal and/or intrathecal delivery. More detailed descriptions
of ribozyme delivery and administration are provided in Sullivan et al. (Int. Pat. Appl. Publ.
No. WO 94/02595) and Draper et al. (Int. Pat. Appl. Publ. No. WO 93/23569), which have
been incorporated by reference herein.

Another means of accumulating high concentrations of a ribozyme(s) within cells
is to incorporate the ribozyme-encoding sequences into a DNA expression vector.
Transcription of the ribozyme sequences is driven from a promoter for eukaryotic RNA
polymerase I (pol I), RNA polymerase II (pol II), or RNA polymerase III (pol III). Transcripts
from pol II or pol III promoters will be expressed at high levels in all cells; the levels of a
given pol II promoter in a given cell type will depend on the nature of the gene regulatory
sequences (enhancers, silencers, etc.) present nearby. Prokaryotic RNA polymerase
promoters may also be used, providing that the prokaryotic RNA polymerase enzyme is expressed in
the appropriate cells (Elroy-Stein and Moss, 1990; Gao and Huang, 1993; Lieber et al., 1993;
Zhou et al., 1990). Ribozymes expressed from such promoters can function in mammalian
cells (e.g., Kashani-Sabet et al., 1992; Ojwang et al., 1992; Chen et al., 1992; Yu et al.,
1993; L'Huillier et al., 1992; Lisziewicz et al., 1993). Such transcription units can be
incorporated into a variety of vectors for introduction into mammalian cells, including but not
restricted to, plasmid DNA vectors, viral DNA vectors (such as adenovirus or adeno-associated
vectors), or viral RNA vectors (such as retroviral, Semliki Forest virus, Sindbis virus vectors).

Ribozymes of this invention may be used as diagnostic tools to examine
genetic drift and mutations within cell lines or cell types. They can also be used to assess levels
of the target RNA molecule. The close relationship between ribozyme activity and the
structure of the target RNA allows the detection of mutations in any region of the molecule
which alters the base-pairing and three-dimensional structure of the target RNA. By using
multiple ribozymes described in this invention, one may map nucleotide changes which are
important to RNA structure and function in vitro, as well as in cells and tissues. Cleavage of
target RNAs with ribozymes may be used to inhibit gene expression and define the role
(essentially) of specified gene products in particular cells or cell types.

5.0 Examples 

The following examples are included to demonstrate preferred embodiments of
the invention. It should be appreciated by those of skill in the art that the techniques
disclosed in the examples which follow represent techniques discovered by the inventor to
function well in the practice of the invention, and thus can be considered to constitute
preferred modes for its practice. However, those of skill in the art should, in light of the present
disclosure, appreciate that many changes can be made in the specific embodiments which are
disclosed and still obtain a like or similar result without departing from the spirit and scope
of the invention.

5.1 Example 1 - Three-Dimensional Structure of Cry3Bb 

The three-dimensional structure of Cry3Bb was determined by X-ray
crystallography. Crystallization of Cry3Bb and X-ray diffraction data collection were performed as
described by Cody et al. (1992). The crystal structure of Cry3Bb was refined to a residual R
factor of 18.0% using data collected to 2.4 Å resolution. The crystals belong to the space
group C222₁ with unit cell dimensions a = 122.44, b = 131.81, and c = 105.37 Å and contain
one molecule in the asymmetric unit. Atomic coordinates for Cry3Bb are described in
Example 31 and listed in Section 9.

The structure of Cry3Bb is similar to that of Cry3A (Li et al., 1991). It consists of
5825 protein atoms from 588 residues (amino acids 64 - 652) forming three discrete domains
(FIG. 1). A total of 251 water molecules have been identified in the Cry3Bb structure
(FIG. 2). Domain 1 (residues 64 - 294) is a seven-helix bundle formed by six helices
twisted around the central helix, α5 (FIG. 3). The amino acids forming each helix are listed
in FIG. 4. Domain 2 (residues 295 - 502) contains three antiparallel β-sheets (FIG. 5A and
FIG. 5B). Sheets 1 and 2, each composed of 4 β strands, form the distinctive "Greek key"
motif. The outer surface of sheet 3, composed of 3 β strands, makes contact with helix α7 of
domain 1. FIG. 6 lists the amino acids comprising each β strand in domain 2. A small α
helix, α8, which follows β strand 1, is also included in domain 2. Domain 3 (residues 503 -
652) has a "jelly roll" β-barrel topology which has a hydrophobic core and is nearly parallel
to the a and perpendicular to the c axes of the lattice (FIG. 7A and FIG. 7B). The amino
acids comprising each β strand of domain 3 are listed in FIG. 8.

The monomers of Cry3Bb in the crystal form a dimeric quaternary structure along
a two-fold axis parallel to the a axis (FIG. 9A and FIG. 9B). Helix α6 lies in a cleft formed
by the interface of domain 1 and domains 1 and 3 of its symmetry-related molecule. There
are numerous close hydrogen bonding contacts along this surface, confirming the structural
stability of the dimer.

5.2 Example 2 - Preparation of Cry3Bb.60

B. thuringiensis EG7231 was grown through sporulation in C2 medium with
chloramphenicol (Cml) selection. The solids from this culture were recovered by centrifugation
and washed with water. The toxin was purified by recrystallization from 4.0 M NaBr (Cody
et al., 1992). The purified Cry3Bb was solubilized in 10 ml of 50 mM KOH per 100 mg
Cry3Bb and buffered to pH 9.0 with 100 mM CAPS (pH 9.0). The soluble toxin was treated
with trypsin at a weight ratio of 50 mg toxin to 1 mg trypsin. After 20 min of trypsin
digestion the predominant protein visualized by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) was 60 kDa. Further digestion of the 60-kDa toxin was not observed. FIG. 4
illustrates the Coomassie-stained Cry3Bb and Cry3Bb.60 following SDS-PAGE.
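
The ratios stated in this example (10 ml of 50 mM KOH per 100 mg toxin, and 50 mg toxin per 1 mg trypsin) can be scaled to any batch size. The helper below is a hypothetical convenience sketch. The function name and the returned keys are illustrative assumptions. Only the two ratios come from the text above, and the volume of CAPS buffer is not computed because the example does not state it.

```python
def digestion_recipe(toxin_mg: float) -> dict:
    """Scale the solubilization and trypsin amounts to a given mass of Cry3Bb.

    Ratios taken from Example 2:
      - 10 ml of 50 mM KOH per 100 mg toxin
      - 1 mg trypsin per 50 mg toxin
    The function name and dictionary keys are illustrative only.
    """
    if toxin_mg <= 0:
        raise ValueError("toxin_mg must be positive")
    return {
        "koh_50mM_ml": toxin_mg * 10.0 / 100.0,
        "trypsin_mg": toxin_mg / 50.0,
    }


if __name__ == "__main__":
    for mass in (25.0, 100.0, 250.0):
        recipe = digestion_recipe(mass)
        print(f"{mass:6.1f} mg toxin -> "
              f"{recipe['koh_50mM_ml']:.1f} ml KOH, "
              f"{recipe['trypsin_mg']:.1f} mg trypsin")
```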

5.3 Example 3 - Purification and Sequencing of Cry3Bb.60 

Cry3Bb.60 was electrophoretically purified by SDS-PAGE and electroblotted to
Immobilon-P® (Millipore) membrane by semi-dry transfer at 15V for 30 min. The
membrane was then washed twice with water and stained with 0.025% R-250, 40% methanol. To
reduce the background, the blot was destained with 50% methanol until the stained protein
bands were visible. The blot was then air dried, and the stained Cry3Bb.60 band was cut out
of the membrane. This band was sent to the Tufts University Sequencing Laboratory
(Boston, MA) for N-terminal sequencing. The experimentally-determined N-terminal amino
acid sequence is shown in Table 7 beside the known amino acid sequence starting at amino
acid residue 160.

Table 7

Amino Acid Sequence of the N-Terminus of Cry3Bb.60 and
Comparison to the Known Sequence of Cry3Bb

Deduced Sequence    Known Sequence    Residue #
       S                  S              160
       K                  K              161
       R                  R              162
       S                  S              163
       Q                  Q              164
       D                  D              165
       R                  R              166


5.4 Example 4 - Bioactivity of Cry3Bb.60

Cry3Bb was prepared for bioassay by solubilization in a minimal amount of 50
mM KOH, 10 ml per 100 mg toxin, and buffered to pH 9.0 with 100 mM CAPS, pH 9.0.
Cry3Bb.60 was prepared as described in Example 2. Both preparations were kept at room
temperature 12 to 16 hours prior to bioassay. After seven days the mortality of the
population was determined and analyzed to determine the lethal concentration of each toxin. These
results are summarized in Table 8.

Table 8

Bioactivity of Cry3Bb and Cry3Bb.60 Against the Southern Corn Rootworm
(Diabrotica undecimpunctata)

Toxin        LC50 (mg/well)    95% C.I.
Cry3Bb       24.09             15 - 39
Cry3Bb.60     6.72             5.25 - 8.4
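
The relative potency implied by Table 8 can be read off as a ratio of LC50 values, since a lower LC50 indicates a more active toxin. The sketch below uses only the numbers in Table 8. The function names are illustrative, and the simple interval-overlap check is an informal reading aid, not the statistical analysis used to generate the table.

```python
def fold_increase(lc50_reference: float, lc50_test: float) -> float:
    """Return how many times more active the test toxin is than the reference.

    A lower LC50 means higher activity, so the ratio is reference / test.
    """
    if lc50_reference <= 0 or lc50_test <= 0:
        raise ValueError("LC50 values must be positive")
    return lc50_reference / lc50_test


def intervals_overlap(a: tuple, b: tuple) -> bool:
    """Return True if two closed intervals (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Values from Table 8.
    ratio = fold_increase(24.09, 6.72)
    overlap = intervals_overlap((15.0, 39.0), (5.25, 8.4))
    print(f"Cry3Bb.60 vs Cry3Bb fold increase: {ratio:.2f}")
    print(f"95% confidence intervals overlap: {overlap}")
```

The ratio comes out between 3.5 and 3.6, and the two 95% confidence intervals in Table 8 do not overlap.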



5.5 Example 5 - Ion-Channel Formation by Cry3Bb and Cry3Bb.60

Cry3Bb.60 and Cry3Bb were evaluated for their ability to form ion channels in
planar lipid bilayers. Bilayers of phosphatidylcholine were formed on Teflon® supports over
a 0.7-mm hole. A bathing solution of 3.5 ml 100 mM KOH, 10 mM CaCl2, 100 mM CAPS
(pH 9.5) was placed on either side of the Teflon partition. The toxin was added to one side
of the partition and a voltage of 60 mV was imposed across the phosphatidylcholine bilayer.
Any leakage of ions through the membrane was amplified and recorded. An analysis of the
frequency of the conductances created by either Cry3Bb or Cry3Bb.60 is illustrated in FIG.
5A and FIG. 5B. Cry3Bb.60 readily formed ion channels whereas Cry3Bb rarely formed
channels.

5.6 Example 6 - Formation of High Molecular-Weight Oligomers 

Individual molecules of Cry3Bb or Cry3Bb.60 form a complex with another like
molecule. The ability of Cry3Bb to form an oligomer is not reproducibly apparent. The
complex cannot be repeatedly observed to form under nondenaturing conditions. Cry3Bb.60
formed a significantly greater amount of a higher molecular-weight complex (>120 kDa)
with other Cry3Bb.60 molecules. Oligomers of Cry3Bb are demonstrated by the intensity of
the Coomassie-stained SDS polyacrylamide gel. Oligomerization is visualized on
SDS-PAGE by not heating samples prior to loading on the gel to retain some nondenatured
toxin. These data suggest that Cry3Bb.60 more readily forms the higher order complex than
Cry3Bb alone. Oligomerization is also observed by studying the conductance produced by
these molecules and the time-dependent increase in conductance. This change in
conductance can be attributed to oligomerization of the toxin.

5.7 Example 7 - Design Method 1: Identification and Alteration of
Protease-Sensitive Sites and Proteolytic Processing

It has been reported in the literature that treatment of Cry3A toxin protein with
trypsin, an enzyme that cleaves proteins on the carboxyl side of available lysine and arginine
residues, yields a stable cleavage product of 55 kDa from the 67 kDa native protein (Carroll
et al., 1989). N-terminal sequencing of the 55 kDa product showed cleavage occurs at amino
acid residue R158. The truncated Cry3A protein was found to retain the same level of
insecticidal activity as the native protein. Cry3Bb toxin protein was also treated with trypsin.
After digestion, the protein size decreased from 68 kDa, the molecular weight of the native
Cry3Bb toxin, to 60 kDa. No further digestion was observed. N-terminal sequencing
revealed the trypsin cleavage site of the truncated toxin (Cry3Bb.60) to be amino acid R159 in
lα3,4 of Cry3Bb. Unexpectedly, the bioactivity of the truncated Cry3Bb toxin was found to
increase.

Using this method, protease digestion of a B. thuringiensis toxin protein, a
proteolytically sensitive site was identified on Cry3Bb, and a more highly active form of the
protein (Cry3Bb.60) was identified. Modifications to this proteolytically-sensitive site by
introducing an additional protease recognition site also resulted in the isolation of a biologically
more active protein. It is also possible that removal of other protease-sensitive site(s) may
improve activity. Proteolytically sensitive regions, once identified, may be modified or
utilized to produce biologically more active toxins.

5.7.1 Cry3Bb.60

Treatment of solubilized Cry3Bb toxin protein with trypsin results in the isolation
of a stable, truncated Cry3Bb toxin protein with a molecular weight of 60 kDa (Cry3Bb.60).
N-terminal sequencing of Cry3Bb.60 shows the trypsin-sensitive site to be R159 in lα3,4 of
the native toxin. Trypsin digestion results in the removal of helices 1-3 from the native
Cry3Bb but also increases the activity of the toxin against SCRW larvae approximately
four-fold.

Cry3Bb.60 is a unique toxin with enhanced insecticidal use over the parent
Cry3Bb. Improved biological activity is only one parameter that distinguishes it as a new
toxin. Aside from the reduced size, Cry3Bb.60 is also a more soluble protein. Cry3Bb
precipitates from solution at pH 6.5 while Cry3Bb.60 remains in solution from pH 4.5 to pH 12.
Cry3Bb.60 also forms ion channels with greater frequency than Cry3Bb.

Cry3Bb.60 is produced by either the proteolytic removal of the first 159 amino
acid residues, or the in vivo production of this toxin, by bacteria or plants expressing the gene
for Cry3Bb.60, that is, the Cry3Bb gene without the first 483 nucleotides.

In conclusion, Cry3Bb.60 is distinct from Cry3Bb in several important ways:
enhanced insecticidal activity; enhanced range of solubility; enhanced ability to form channels;
and reduced size.

5.7.2 EG11221

Semi-random mutagenesis of the trypsin-sensitive lα3,4 region of Cry3Bb
resulted in the isolation of Cry3Bb.11221, a designed Cry3Bb protein that exhibits over a
6-fold increase in activity against SCRW larvae compared to WT. Cry3Bb.11221 has 4 amino
acid changes in the lα3,4 region. One of these changes, L158R, introduces an additional
trypsin site adjacent to R159, the proteolytically sensitive site used to produce Cry3Bb.60
(example 4.1.1). Cry3Bb.11221 is produced by B. thuringiensis as a full length toxin protein
but is presumably digested by insect gut proteases to the same size as Cry3Bb.60 (see Cry3A
results from Carroll et al., 1989). The additional protease recognition site may make the
lα3,4 region even more sensitive to digestion, thereby increasing activity.

5.8 Example 8 - Design Method 2: Determination and Manipulation of 
Bound Water 

There are several ways that water molecules can associate with a protein,
including surface water that is easily removed and bound water that is more difficult to extract
(Dunitz, 1994; Zhang and Matthews, 1994). The function of bound water has been the
subject of significant academic extrapolation, but the precise function has little experimental
validation. Some of the most interesting bound or structural water is the water that
participates in the protein structure from inside the protein itself.

The occupation of a site by a water molecule can indicate a stable pocket within a
protein or a looseness of packing created by water-mediated salt bridges and hydrogen
bonding to water. This can reduce the degree of bonding between amino acids, possibly
making the region more flexible. A different amino acid sequence around that same site
could result in better packing, collapsing the pocket around polar or charged amino acids.
This may result in decreased flexibility. Therefore, the degree of hydration of a region of a
protein may determine the flexibility or mobility of that region, and manipulation of the
hydration may alter the flexibility. Methods of increasing the hydration of a water-exposed
region include increasing the number of hydrophobic residues along that surface. It is taught in
the art that exposed hydrophobic residues require significantly more water to hydrate than
hydrophilic residues (CRC Handbook of Chemistry and Physics, CRC Press, Inc.). It is not
taught, however, that by doing this, improvements to the biological activity of a protein can
be achieved.

Structural water has not previously been identified in B. thuringiensis
δ-endotoxins including Cry3Bb. Furthermore, there are no reports of the function of this
structural water in δ-endotoxins or bacterial toxins. In the analysis of Cry3Bb, it was
observed that a collection of water molecules is located around lα3,4, a site defined by the
inventors as important for improvement of bioactivity. The loop α3,4 region is surface
exposed and may define a hinge in the protein permitting either removal or movement of the
first three helices of domain 1. The hydration found around this region may impart flexibility
and mobility to this loop. The observation of structural water at the lα3,4 site provided an
analytical tool for further structure analysis. If this important site is surrounded by water,
then other important sites may also be completely or partially surrounded by water. Using
this insight, structural water surrounding helices 5 and 6 was then identified. This structural
water forms a column through the protein, effectively separating helices 5 and 6 from the rest
of the molecule. The structures of Cry3A and Cry3Bb suggest that helices 5 and 6 are tightly
associated, bound together by van der Waals interactions. Alone, helix 5 from Cry3A,
although insufficient for biological activity, has been demonstrated to have the ability to form
ion channels in an artificial membrane (Gazit and Shai, 1993). The ion channels formed by
helix 5 are 10-fold smaller than the channels of the full length toxin, suggesting that
significantly more toxin structure is required for the full-sized ion channels. In Cry3Bb, helix 5 as
part of a cluster of α helices (domain 1) has been found to form ion channels (Von Tersch et
al., 1994). Unpublished experimental observations by the inventors demonstrate that helix 6
also crossed the biological membrane. Helices 5 and 6, therefore, are the putative
channel-forming helices necessary for toxicity.

The hydration around these helices may indicate that flexibility of this region is
necessary for toxicity. It is conceivable, therefore, that if it were possible to improve the
hydration around helices 5 and 6, one could create a better toxin protein. Care must be taken,
however, to avoid creating continuous hydrophobic surfaces between helices 5-6 and any
other part of the protein which could, by hydrophobic interactions, act to restrict movement
of the mobile helices. The mobility of helices 5 and 6 may also depend on the flexibility of
the loops attached to them as well as on other regions of the Cry3Bb molecule, particularly in
domain 1, which may undergo conformational changes to allow insertion of the 2 helices into
the membrane. Altering the hydration of these regions of the protein may also affect its
bioactivity.

5.8.1 Cry3Bb.11032

A collection of bound water molecules indicated the relative flexibility of the lα3,4
region. The flexibility of this loop can be increased by increasing the hydration of the region
by substituting relatively hydrophobic residues for the exposed hydrophilic residues. An
example of an improved, designed protein having this type of substitution is Cry3Bb.11032.
Cry3Bb.11032 has the amino acid change D165G; glycine is more hydrophobic than
aspartate (Kyte and Doolittle hydrophobicity score of -0.4 vs. -3.5 for aspartate). Cry3Bb.11032 is
approximately 3 times more active than WT Cry3Bb.

5.8.2 Cry3Bb.11051

To increase the hydration of the lα4,5 region of Cry3Bb, glycine was substituted
for the surface exposed residue K189. Glycine is more hydrophobic than lysine (Kyte and
Doolittle hydrophobicity score of -0.4 vs. -3.9 for lysine) and may result in an increase in
bound water. The increase in bound water may impart greater flexibility to the loop region
which precedes the channel-forming helix, α5. The designed Cry3Bb protein with the
K189G change, Cry3Bb.11051, exhibits a 3-fold increase in activity compared to WT
Cry3Bb.
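
The hydrophobicity comparisons quoted for D165G and K189G can be reproduced from the published Kyte and Doolittle (1982) scale. The sketch below is illustrative only. The function name and the substitution-string parsing are assumptions introduced here, and the code looks only at the residue letters, not at any structural context.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_change(substitution: str) -> float:
    """Return the change in hydropathy for a substitution such as 'D165G'.

    The first letter is the wild-type residue and the last letter is the
    replacement. A positive result means the replacement is more hydrophobic.
    """
    wild_type, replacement = substitution[0], substitution[-1]
    return KYTE_DOOLITTLE[replacement] - KYTE_DOOLITTLE[wild_type]


if __name__ == "__main__":
    for change in ("D165G", "K189G"):
        print(f"{change}: {hydropathy_change(change):+.1f}")
```

Both substitutions give a positive change (-3.5 to -0.4 for D165G, and -3.9 to -0.4 for K189G), consistent with the statement that glycine is the more hydrophobic residue in each case.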

5.8.3 Alterations to lα7,β1 (Cry3Bb.11241 and 11242)

Amino acid changes made in the surface-exposed loop connecting α-helix 7 and
β-strand 1 (lα7,β1) resulted in the identification of 2 altered Cry3Bb proteins with increased
bioactivities, Cry3Bb.11241 and Cry3Bb.11242. Analysis of the hydropathy index of 2 of
these proteins over the 20 amino acid sequence 281-300, inclusive of the lα7,β1 region,
reveals that the amino acid substitutions in these proteins have made the lα7,β1 region much
more hydrophobic. The grand average of hydropathy value (GRAVY) was determined for
each protein sequence using the PC/GENE (IntelliGenetics, Inc., Mountain View, CA,
release 6.85) protein sequence analysis computer program, SOAP, and a 7 amino acid interval.
The SOAP program is based on the method of Kyte and Doolittle (1982). The increase in
hydrophobicity of the lα7,β1 region for each protein may increase the hydration of the loop
and, therefore, the flexibility. The altered proteins, their respective amino acid changes,
fold-increases over WT bioactivity, and GRAVY values are listed in Table 9.

Table 9

Hydropathy Values for the lα7,β1 Region of Cry3Bb and 2 Designed Cry3Bb
Proteins Showing Increased SCRW Bioactivity

Cry3Bb*         Amino Acid Changes      Fold Increase in    GRAVY
Protein                                 Bioactivity Over    (Amino Acids
                                        WT                  281-300)

wildtype        -                       -                    4.50
Cry3Bb.11241    Y287F, D288N, R290L     2.6x                10.70
Cry3Bb.11242    R290V                   2.5x                 8.85
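
The kind of hydropathy analysis described above can be sketched with the Kyte and Doolittle scale and a 7-residue window. The code below is not the SOAP program and is not expected to reproduce the values in Table 9, because the exact scaling used by that program is not stated here. The 20-residue fragment is a hypothetical sequence invented for illustration (it is not residues 281-300 of Cry3Bb), and the function names are assumptions.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Hypothetical 20-residue fragment, NOT the real Cry3Bb sequence.
EXAMPLE_FRAGMENT = "GSTDYRNQKEASTDGNQKSE"


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy over the whole sequence."""
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def window_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of every consecutive window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [gravy(sequence[i:i + window])
            for i in range(len(sequence) - window + 1)]


def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position within the fragment."""
    return sequence[:position - 1] + new_residue + sequence[position:]


if __name__ == "__main__":
    # Replace the arginine at fragment position 6 with leucine, by analogy
    # with an R-to-L change; the position is illustrative only.
    mutant = substitute(EXAMPLE_FRAGMENT, 6, "L")
    print(f"wild-type fragment GRAVY: {gravy(EXAMPLE_FRAGMENT):.3f}")
    print(f"mutant fragment GRAVY:    {gravy(mutant):.3f}")
    print(f"windows per fragment:     {len(window_profile(mutant))}")
```

A single arginine-to-leucine replacement raises the fragment average because the two residues lie near opposite ends of the scale (-4.5 versus 3.8), which is the qualitative trend reported for the R290 substitutions.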



5.8.4 Alterations to lβ1,α8 (Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238 and
Cry3Bb.11239)

The surface-exposed loop between β-strand 1 and α-helix 8 (lβ1,α8) defines the
boundary between domains 1 and 2 of Cry3Bb. The introduction of semi-random amino acid
changes to this region resulted in the identification of several altered Cry3Bb proteins with
increased bioactivity. Hydropathy index analysis of the amino acid substitutions found in the
altered proteins shows that the changes have made the exposed region more hydrophobic,
which may result in increased hydration and flexibility. Table 10 lists the altered proteins,
their respective amino acid changes and fold increases over WT Cry3Bb and the grand
average of hydropathy value (GRAVY) determined using the PC/GENE® (IntelliGenetics, Inc.,
Mountain View, CA, release 6.85) protein sequence analysis program, SOAP, over the 20
amino acid sequence 305 - 324 inclusive of lβ1,α8, using a 7 amino acid interval.



Table 10

Hydropathy Values for the lβ1,α8 Region of Cry3Bb and 8 Designed Cry3Bb*
Proteins Showing Increased SCRW Bioactivity

Cry3Bb*         Amino Acid                     Fold Increase in    GRAVY
Protein         Changes                        Bioactivity Over    (Amino Acids
                                               Wild Type           305-324)

wildtype        -                              -                   0.85
Cry3Bb.11228    S311L, N313T, E317K            4.1x                4.35
Cry3Bb.11229    S311T, E317K, Y318C            2.5x                2.60
Cry3Bb.11230    S311A, L312V, Q316W            4.7x                3.65
Cry3Bb.11233    S311A, Q316D                   2.2x                2.15
Cry3Bb.11236    S311I                          3.1x                3.50
Cry3Bb.11237    S311I, N313H                   5.4x                3.65
Cry3Bb.11238    N313V, T314N, Q316M, E317V     2.6x                9.85
Cry3Bb.11239    N313R, L315P, Q316L, E317A     2.8x                3.95







5.8.5 Cry3Bb.11227, Cry3Bb.11241 and Cry3Bb.11242

Amino acid Q238, located in helix 6 of Cry3Bb, has been identified as a residue
that, by its large size and hydrogen bonding to R290, blocks complete hydration of the space
between helix 6 and helix 4. Substitution of R290 with amino acids that do not form
hydrogen bonds or that have side chains that cannot span the physical distance to hydrogen bond
with Q238 may result in increased hydration around Q238. Q238, unable to hydrogen bond
to R290, may now bind water. This may increase the flexibility of the channel-forming
region. Designed proteins Cry3Bb.11227 (R290N), Cry3Bb.11241 (R290L) and
Cry3Bb.11242 (R290V) show increased activities of approximately 2-fold, 2.6-fold and
2.5-fold, respectively, against SCRW larvae compared to WT.

5.9 Example 9 - Design Method 3: Manipulation of Hydrogen Bonds 
Around Mobile Regions 

Mobility of regions of a protein may be required for activity. The mobility of the
α5,6 region, the putative channel-forming region of Cry3Bb, may be improved by decreasing
the number of hydrogen bonds, including salt bridges (hydrogen bonds between oppositely
charged amino acid side chains), between helices 5-6 and any other part of the molecule or
dimer structure. These hydrogen bonds may impede the movement of the two helices.
Decreasing the number of hydrogen bonds and salt bridges may improve biological activity.
Replacement of hydrogen-bonding amino acids with hydrophobic residues must be done with
caution to avoid creating continuous hydrophobic surfaces between helices 5-6 and any other
part of the dimer. This may decrease mobility by increasing hydrophobic surface
interactions.

5.9.1 Cry3Bb.11222 and Cry3Bb.11223

Tyr230 is located on helix 6 and, in the quaternary dimer structure of Cry3Bb, this
amino acid is coordinated with Tyr230 from the adjacent molecule. Three hydrogen bonds
are formed between the two helices 6 in the two monomers because of this single amino acid.
In order to improve the flexibility of helices 5-6, the helices theoretically capable of
penetrating the membrane and forming an ion channel, the hydrogen bonds across the dimer were
removed by changing this amino acid and a corresponding increase in biological activity was
observed. The designed Cry3Bb proteins, Cry3Bb.11222 and Cry3Bb.11223, show a
4-fold and 2.8-fold increase in SCRW activity, respectively, compared to WT.

5.9.2 Cry3Bb.11051

Designed Cry3Bb protein Cry3Bb.11051 has amino acid change K189G in lα4,5
of domain 1. In the WT Cry3Bb structure, the exposed side chain of K189 is close enough to
the exposed side chain of E123, located in lα2b,3, to form hydrogen bonds. Substitution of
K189 with glycine, as found in this position in Cry3A, removes the possibility of hydrogen
bond formation at this site and results in a protein with a bioactivity three-fold greater than
WT Cry3Bb.

5.9.3 Cry3Bb.11227, Cry3Bb.11241 and Cry3Bb.11242

Amino acid Q238, located in helix 6 of Cry3Bb, has been identified as a residue
that, by its large size and hydrogen bonding to R290, blocks complete hydration of the space
between helix 6 and helix 4. Substitution of R290 with amino acids that do not form
hydrogen bonds or that have side chains that cannot span the physical distance to hydrogen bond
with Q238 may increase the flexibility of the channel-forming region. Designed proteins
Cry3Bb.11227 (R290N), Cry3Bb.11241 (R290L) and Cry3Bb.11242 (R290V) show
increased activities of approximately 2-fold, 2.6-fold and 2.5-fold, respectively, against SCRW
larvae compared to WT.

5.10 Example 10 - Design Method 4: Loop Analysis and Loop Design
Around Flexible Helices

Loop regions of a protein structure may be involved in numerous functions of the
protein including, but not limited to, channel formation, quaternary structure formation and
maintenance, and receptor binding. Cry3Bb is a channel-forming protein. The availability of
the ion channel-forming helices of δ-endotoxins to move into the bilayer depends upon the
absence of forces that hinder the process. One of the forces possibly limiting this process is
the steric hindrance of amino acid side chains in loop regions around the critical helices. The
literature suggests that in at least one other bacterial toxin, not a B. thuringiensis toxin, the
toxin molecule opens up or, in scientific terms, loses some of the quaternary structure to
expose a membrane-active region (Cramer et al., 1990). This literature does not teach how to
improve the probability of this event occurring and it is not known if B. thuringiensis toxins
use this same process to penetrate the membrane. Reducing the steric hindrance of the amino
acid side chains in these critical regions by reducing size or altering side chain positioning
with the corresponding increase in biological activity was the inventive step.

5 5.10.1 Analysis of the Loop Between Helices 3 and 4 (Cry3Bb.1 1032) 

The inventors have discovered that the first three helices of domain 1 could be
cleaved from the rest of the toxin by proteolytic digestion of the loop between helices α3 and
α4 (Cry3Bb.60). Initial efforts to truncate the cry3Bb gene to produce this shortened, though
more active, Cry3Bb molecule failed. For unknown reasons, B. thuringiensis failed to
synthesize this 60-kDa molecule. It was then reasoned that perhaps the first three helices of
domain 1 did not have to be proteolytically removed or, equivalently, the protein did not have
to be synthesized in this truncated form to take advantage of the Cry3Bb.60 design. It was
observed that the protein Cry3A had a small amino acid near lα3,4 that might impart
greater flexibility in the loop region, thereby permitting the first three helices of domain 1 to
move out of the way, exposing the membrane-active region. By designing a Cry3Bb molecule
with a glycine residue near this loop, the steric hindrance of residues in the loop might
be lessened. The redesigned protein, Cry3Bb.11032, has the amino acid change D165G,
which replaces the larger aspartate residue (average mass of 115.09) with the smallest amino
acid, glycine (average mass of 57.05). The activity of Cry3Bb.11032 is approximately 3-fold
greater than that of the WT protein. In this way, the loop between helices α3 and α4 was
rationally redesigned with a corresponding increase in the biological activity.
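
The size argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the disclosed method: it parses a substitution code such as D165G and reports the change in average residue mass. The masses for D and G are the values quoted above; the remaining entries are standard textbook values added here for illustration, and the function names are hypothetical.

```python
import re

# Average residue masses in daltons. D and G are the values quoted in the
# text; the other entries are standard textbook values (an assumption added
# for illustration, not taken from this document).
RESIDUE_MASS = {
    "G": 57.05, "V": 99.13, "T": 101.10, "L": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "H": 137.14, "R": 156.19,
}

# A substitution code is: wild-type residue, position, replacement residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a code such as 'D165G' into ('D', 165, 'G')."""
    match = _MUTATION.match(code)
    if match is None:
        raise ValueError(f"unrecognized mutation code: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def mass_change(code):
    """Return replacement mass minus wild-type mass, rounded to 0.01 Da."""
    wild_type, _, replacement = parse_mutation(code)
    return round(RESIDUE_MASS[replacement] - RESIDUE_MASS[wild_type], 2)


if __name__ == "__main__":
    for code in ("D165G", "K189G"):
        print(code, mass_change(code))
```

A negative value indicates that the replacement side chain is smaller than the wild-type one, which is the direction sought in this design method.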

5.10.2 Cry3Bb.11051

The loop region connecting helices α4 and α5 in Cry3Bb must be flexible so that
the channel-forming helices α5-α6 can penetrate into the membrane. It was noticed that
Cry3A has a glycine residue in the middle of this loop that may impart greater flexibility.
The corresponding change, K189G, was made in Cry3Bb and the resulting designed protein,
Cry3Bb.11051, exhibits a 3-fold increase in activity against SCRW larvae compared to WT
Cry3Bb.

5.10.3 Analysis of the Loop Between β-Strand 1 and Helix 8 (Cry3Bb.11228,
Cry3Bb.11229, Cry3Bb.11230, Cry3Bb.11232, Cry3Bb.11233,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, and Cry3Bb.11239)

The loop region located between β strand 1 of domain 2 and α helix 8 in domain 2
is very close to the loop between α helices 6 and 7 in domain 1. Some of the amino acid
side chains of lβ1,α8 appear as though they may sterically impede movement of lα6,7. Since
lα6,7 must be flexible for the channel-forming helices α5-α6 to insert into the membrane, it
was thought that re-engineering this loop might change the positioning of the side chains,
resulting in less steric hindrance. This was accomplished, creating proteins with increased
biological activities ranging from 2.2 to 5.4 times greater than WT. These designed toxin
proteins and their amino acid changes are listed in Table 2 as Cry3Bb.11228, Cry3Bb.11229,
Cry3Bb.11230, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238, and Cry3Bb.11239.

5.10.4 Analysis of the Loop Between Helix 7 and β-Strand 1 (Cry3Bb.11227,
Cry3Bb.11234, Cry3Bb.11241, Cry3Bb.11242, and Cry3Bb.11036)

If Cry3Bb is similar to a bacterial toxin which must open up to expose a membrane-active
region for toxicity, it is possible that other helices in addition to the channel-forming
helices must also change positions. It was reasoned that, if helices α5-α6 insert into
the membrane, then helix α7 may have to change positions also. It was shown in example
4.4.3 that increasing flexibility between helices α6 and α7 can increase activity; greater
flexibility in the loop following helix α7, lα7,β1, may also increase bioactivity. Alterations
to the lα7,β1 region of Cry3Bb resulted in the isolation of several proteins with increased
activities ranging from 1.9 to 4.3 times greater than WT. These designed proteins are listed
in Table 7 as Cry3Bb.11227, Cry3Bb.11234, Cry3Bb.11241, Cry3Bb.11242, and
Cry3Bb.11036.

5.11 Example 11 - Design Method 5: Loop Design Around β Strands
and β Sheets

Loop regions of a protein structure may be involved in numerous functions of the
protein including, but not limited to, channel formation, quaternary structure formation and
maintenance, and receptor binding. A binding surface is often defined by a number of loops,
as is the case with immunoglobulin G (IgG) (see Branden and Tooze, 1991, for review).
What cannot be determined at this point, however, is which loops will be important for
receptor interactions just by looking at the structure of the protein in question. Since a receptor
has not been identified for Cry3Bb, it is not even possible to compare the structure of Cry3Bb
with other proteins that have the same receptor for structural similarities. To identify Cry3Bb
loops that contribute to receptor interactions, random mutagenesis was performed on
surface-exposed loops.

As each loop was altered, the profile of the overall bioactivities of the resultant
proteins was examined and compared. The loops, especially in domain 2, which appears to
be unnecessary for channel activity, fall into two categories: (1) loops that could be altered
without much change in the level of bioactivity of the resultant proteins and (2) loops where
alterations resulted in overall loss of resultant protein bioactivity. Using this design method,
it is possible to identify several loops important for activity.

5.11.1 Analysis of Loop β2,3

Semi-random mutagenesis of the loop region between β strands 2 and 3 resulted
in the production of structurally stable toxin proteins with significantly reduced activities
against SCRW larvae. The lβ2,3 region is highly sensitive to amino acid changes, indicating
that specific amino acids or amino acid sequences are necessary for toxin protein activity. It
is conceivable, therefore, that specific changes in the lβ2,3 region will increase the binding
and, therefore, the activity of the redesigned toxin protein.

5.11.2 Analysis of Loop β6,7

Semi-random mutations introduced to the loop region between β strands 6 and 7
resulted in structurally stable proteins with an overall loss of SCRW bioactivity. The lβ6,7
region is highly sensitive to amino acid changes, indicating that specific amino acids or amino
acid sequences are necessary for toxin protein activity. It is conceivable, therefore, that
specific changes in the lβ6,7 region will increase the binding and, therefore, the activity of the
redesigned toxin protein.

5.11.3 Analysis of Loop β10,11

Random mutations to the loop region between β strands 10 and 11 resulted in
proteins having an overall loss of SCRW bioactivity. Loop β10,11 is structurally close to
and interacts with loops β2,3 and β6,7. Specific changes to individual residues within the
lβ10,11 region may also result in increased interaction with the insect membrane, increasing
the bioactivity of the toxin protein.

5.11.4 Cry3Bb.11095

Loops β2,3, β6,7 and β10,11 have been identified as important for bioactivity of
Cry3Bb. The 3 loops are surface-exposed and structurally close together. Amino acid Q348
in the WT structure, located in β-strand 2 just prior to lβ2,3, does not form any intramolecular
contacts. However, replacing Q348 with arginine (Q348R) results in the formation of 2
new hydrogen bonds between R348 and the backbone carbonyls of R487 and R488, both
located in lβ10,11. The new hydrogen bonds may act to stabilize the structure formed by the 3
loops. The designed protein carrying this change, Cry3Bb.11095, is 4.6-fold more active
than WT Cry3Bb.

5.12 Example 12 - Design Method 6: Identification and
Re-design of Complex Electrostatic Surfaces

Interactions of proteins include hydrophobic interactions (e.g., van der Waals
forces), hydrophilic interactions, including those between opposing charges on amino acid
side chains (salt bridges), and hydrogen bonding. Very little is known about δ-endotoxin and
receptor interactions. Currently, there are no literature reports identifying the types of
interactions that predominate between B. thuringiensis toxins and receptors.

Experimentally, however, it is important to increase the strength of the B.
thuringiensis toxin-receptor interaction and not permit the precise determination of the
chemical interaction to stand in the way of improving it. To accomplish this, the electrostatic
surface of Cry3Bb was defined by solving the Poisson-Boltzmann distribution around the
molecule. Once this electrically defined surface was solved, it could then be inspected for
regions of greatest diversity. It was reasoned that these electrostatically diverse regions
would have the greatest probability of participating in the specific interactions between the B.
thuringiensis toxin proteins and the receptor, rather than more general and non-specific
interactions. Therefore, these regions were chosen for redesign, continuing to increase the
electrostatic diversity of the regions. In addition, examination of the electrostatic interaction
around the putative channel-forming region of the toxin created insights for redesign. This
includes identification of an electropositive residue in an otherwise negatively charged
conduit (see example 4.6.1).
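
For orientation, the nonlinear Poisson-Boltzmann equation that a calculation of this kind solves is commonly written in the following general form. The symbols are not defined elsewhere in this document and are introduced here only as an illustrative assumption: φ is the electrostatic potential, ε the position-dependent dielectric constant, ρ_f the fixed charge density of the protein, c_i^∞ and z_i the bulk concentration and valence of mobile ion species i, q the elementary charge, and λ a factor describing the accessibility of each position to ions. Which solver and parameters were actually used is not stated here.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  = -\rho_{\mathrm{f}}(\mathbf{r})
    - \sum_{i} c_{i}^{\infty} z_{i} q \, \lambda(\mathbf{r})
      \exp\!\left( -\frac{z_{i} q \, \phi(\mathbf{r})}{k_{B} T} \right)
```

Mapping the solved potential onto the molecular surface gives the electrostatically defined surface that is inspected for regions of greatest diversity.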

5.12.1 R290 (Cry3Bb.11227, Cry3Bb.11241, and Cry3Bb.11242)

Examination of the Cry3Bb dimer interface along the domain 1 axis suggested
that a pore or conduit for cations might be formed between the monomers. Electrostatic
examination of this axis lent additional credibility to this suggestion. In fact, the hypothetical
conduit is primarily negatively charged, an observation consistent with the biophysical
analysis of cation-selective δ-endotoxin channels. If a cation channel were formed along the
axis of the dimer, then the cation could move between the monomers relatively easily with
only one significant hurdle. A positively charged arginine residue (R290) lies in the otherwise
negatively charged conduit. This residue could impede the cation movement through
the channel. Based on this analysis, R290 was changed to uncharged residues. The bioactivity
of redesigned proteins Cry3Bb.11227 (R290N), Cry3Bb.11241 (R290L) and
Cry3Bb.11242 (R290V) was improved approximately 2-fold, 2.6-fold and 2.5-fold,
respectively.

5.12.2 Cry3Bb.60

Trypsin digestion of solubilized Cry3Bb yields a stable, truncated protein with a
molecular weight of 60 kDa (Cry3Bb.60). Trypsin digestion occurs on the carboxyl side of
residue R159, effectively removing helices 1 through 3 from the native Cry3Bb structure.
The cleavage of the first 3 helices exposes an electrostatic surface different from those found
in the native structure. The new surface has a combination of hydrophobic, polar and
charged characteristics that may play a role in membrane interactions. The bioactivity of
Cry3Bb.60 is 3.6-fold greater than that of WT Cry3Bb.

5.13 Example 13 - Design Method 7: Identification and Removal of
Metal Binding Sites

The literature teaches that the in vitro activity of B. thuringiensis toxins can be
increased by chelating divalent cations from the experimental system (Crawford and Harvey,
1988). It was not known, however, how these divalent cations inhibited the in vitro activity.
Crawford and Harvey (1988) demonstrated that the short circuit current across the midgut
was more severely inhibited by B. thuringiensis in the presence of EDTA, a chelator of
divalent ions, than in the absence of this agent, thus suggesting that this step in the mode of action
of B. thuringiensis could be potentiated by removing divalent ions. Similar observations
were made using black-lipid membranes and measuring an increase in the current created by
the δ-endotoxins in the presence of EDTA to chelate divalent ions. There were at least three
possible explanations for these observations. The first explanation could be that the divalent
ions are too large to move through an ion channel more suitable for monovalent ions, thereby
blocking the channel. Second, the divalent ions may cover the protein in a very general
way, thereby buffering the charge interactions required for toxin-membrane interaction and
limiting ion channel activity. The third possibility is that a specific metal binding site exists
on the protein and, when occupied by divalent ions, the performance of the ion channel is
impaired. Although the literature could not differentiate the value of one possibility over
another, the third possibility led to an analysis of the Cry3Bb structure searching for a
specific metal binding site that might alter the probability that a toxin could form an ion channel.


5.13.1 H231 (Cry3Bb.11222, Cry3Bb.11224, Cry3Bb.11225, and Cry3Bb.11226)

A putative metal binding site is formed in the Cry3Bb dimer structure by the
H231 residues of each monomer. The H231 residues, located in helix α6, lie adjacent to each
other and close to the axis of symmetry of the dimer. Removal of this site by replacement of
histidine with other amino acids was evaluated by the absence of EDTA-dependent ion
channel activity. The bioactivities of the designed toxin proteins, Cry3Bb.11222, Cry3Bb.11224,
Cry3Bb.11225 and Cry3Bb.11226, are increased 4-, 5-, 3.6- and 3-fold, respectively, over
that of WT Cry3Bb. Their respective amino acid changes are listed in Table 2.

5.14 Example 14 - Design Method 8: Alteration of Quaternary Structure

Cry3Bb can exist in solution as a dimer similar to a related protein, Cry3A
(Walters et al., 1992). However, the importance of the dimer to biological activity is not
known because the toxin as a monomer or as a higher order structure has not been seriously
evaluated. It is assumed that specific amino acid residues contribute to the formation and
stability of the quaternary structure. Once a contributing residue is identified, alterations can
be made to diminish or enhance the effect of that residue, thereby affecting the interaction
between monomers. Channel activity is a useful way, but by no means the only way, to
assess quaternary structure of Cry3Bb and its derivatives. It has been observed that Cry3Bb
creates gated conductances in membranes that grow in size with time, ultimately resulting in
large pores in the membrane (the channel activity of WT Cry3Bb is described in Section
12.1). It also has been observed that Cry3A forms a more stable dimer than Cry3Bb and
coincidentally forms higher level conductances faster (FIG. 10). This observation led the
inventors to propose that oligomerization and ion channel formation (conductance size and
speed of channel formation) were related. Based on this observation, Cry3Bb was
re-engineered to make larger and more stable oligomers at a faster rate. It is assumed in this
analysis that the rate of ion channel formation and growth mirrors this process. It is also
possible that changes in quaternary structure may not affect channel activity alone or at all.
Alterations to quaternary structure may also affect receptor interactions, protein processing in
the insect gut environment, as well as other, unknown aspects of bioactivity.

5.14.1 Cry3Bb.11048

Comparative structural analysis of Cry3A and Cry3Bb led to the identification of
structural differences between the two toxins in the ion channel-forming domain; specifically,
an insertion of one amino acid between helix 2a and helix 2b in Cry3Bb. Removal of this
additional amino acid in Cry3B2, A104, and a D103E substitution, as in Cry3A, resulted in
loss of channel gating and the formation of symmetrical pores. Once the pores are formed
they remain open and allow a steady conductance ranging from 25-130 pS. This designed
protein, Cry3Bb.11048, is 4.3 times more active than WT Cry3Bb against SCRW larvae.

5.14.2 Oligomerization of Cry3Bb.60 

Individual molecules of Cry3Bb or Cry3Bb.60 can form a complex with another
like molecule. Oligomerization of Cry3Bb is demonstrated by SDS-PAGE, where samples
are not heated in sample buffer prior to loading on the gel. The lack of heat treatment allows
some nondenatured toxin to remain. Oligomerization is visualized following Coomassie
staining by the appearance of a band at 2 times the molecular weight of the monomer. The
intensity of the higher molecular weight band reflects the degree of oligomerization. The
ability of Cry3Bb to form an oligomer is not reproducibly apparent; the complex is not
observed in every experiment. Cry3Bb.60, however, forms a significantly greater amount of a
higher molecular weight complex (120 kDa). These data suggest that Cry3Bb.60 more readily
forms the higher order complex than Cry3Bb alone. Cry3Bb.60 also forms ion channels
with greater frequency than WT Cry3Bb (see Section 5.12.9).

5.14.3 Cry3Bb.11035

Changes were made in Cry3Bb to reflect the amino acid sequence in Cry3A at the
end of lα3,4 and in the beginning of helix 4. These changes resulted in the designed protein,
Cry3Bb.11035, that, unlike wild type Cry3Bb, forms spontaneous channels with large
conductances. Cry3Bb.11035 is also approximately three times more active against SCRW
larvae than WT Cry3Bb. Cry3Bb.11035 and its amino acid changes are listed in Table 10.

5.14.4 Cry3Bb.11032

Cry3Bb.11032 was altered at residue 165 in helix α4, changing an aspartate to
glycine, as found in Cry3A. Cry3Bb.11032 is three-fold more active than WT Cry3Bb. The
channel activity of Cry3Bb.11032 is much like that of Cry3Bb except when the designed protein is
artificially incorporated into the membrane. A 16-fold increase in the initial channel
conductances is observed compared to WT Cry3Bb (see Section 5.12.2). This increase in initial
conductance presumably is due to enhanced quaternary structure, stability or higher-order
structure.

5.14.5 EG11224 

In the WT Cry3Bb dimer structure, histidine at position 231 in domain 1 makes
hydrogen bond contacts with D288 (domain 1), Y230 (domain 1), and, through a network of
water molecules, also makes contacts to D610 (domain 3), all of the opposite monomer.
D610 and K235 (domain 1) also make contact. Replacing the histidine with an arginine,
H231R, results, in one orientation, in the formation of a salt bridge to D610 of the
neighboring monomer. In a second orientation, the contacts with D288 of the neighboring monomer,
as appear in the WT structure, are retained. In either orientation, R231 does not hydrogen
bond to Y230 of the opposite monomer but does make contact with K235, which retains its
contacts to D610 (V. Cody, research communication). The shifting hydrogen bonds have
changed the interactions between the different domains of the protein in the quaternary
structure. Overall, fewer hydrogen bonds exist between domains 1 of the neighboring
monomers and a much stronger bond has been formed between domains 1 and 3. Channel
activity was found to be altered. Cry3Bb.11224 produces small, quickly gating channels like
Cry3Bb. However, unlike WT Cry3Bb, Cry3Bb.11224 does not exhibit β-mercaptoethanol-dependent
activation. Replacing H231 with arginine resulted in a designed Cry3Bb protein,
Cry3Bb.11224, exhibiting a 5-fold increase in bioactivity.

5.14.6 Cry3Bb.11226

Cry3Bb.11226 is similar to Cry3Bb.11224, discussed in Section 4.8.5, in that the
histidine at position 231 has been replaced. The amino acid change, H231T, results in the
loss of β-mercaptoethanol-dependent activation seen with WT Cry3Bb (see Section 5.12.1).
The replacement of H231, a putative metal binding site, changes the interaction of regions in
the quaternary structure, resulting in a different type of channel activity. Cry3Bb.11226 is
three-fold more active than WT Cry3Bb.

5.14.7 Cry3Bb.11221

Cry3Bb.11221 has been re-designed in the lα3,4 region of Cry3Bb. The channels
formed by Cry3Bb.11221 are much better resolved than the conductances formed by WT
Cry3Bb (see Section 5.12.6). Cry3Bb.11221 exhibits a 6.4-fold increase in bioactivity over
that of WT Cry3Bb. The amino acid changes found in Cry3Bb.11221 are listed in Table 2.

5.14.8 Cry3Bb.11242

The designed protein, Cry3Bb.11242, carrying the alteration R290V, forms small
conductances immediately, which grow rapidly and steadily to large conductances in about 3
min (see Section 5.12.7). This is in contrast to WT Cry3Bb channels, which take 30-45 min to
appear and grow slowly over hours to large conductances. Cry3Bb.11242 also exhibits a
2.5-fold increase in bioactivity compared to WT Cry3Bb.

5.14.9 Cry3Bb.11230

Cry3Bb.11230, unlike WT Cry3Bb, forms well resolved channels with long open
states. These channels reach a maximum conductance of 3000 pS but do not continue to
grow with time. Cry3Bb.11230 has been re-designed in the lβ1,α8 region of Cry3Bb and
exhibits almost a 5-fold increase in activity against SCRW larvae (Table 9) and a 5.4-fold
increase against WCRW larvae (Table 10) compared to WT Cry3Bb. The amino acid
changes found in Cry3Bb.11230 are listed in Table 2.

5.15 Example 15 - Design Method 9: Design of Structural Residues

The specific three-dimensional structure of a protein is held in place by amino
acids that may be buried or otherwise removed from the surface of the protein. These
structural determinants can be identified by inspection of forces responsible for the surface
structure positioning. The impact of these structural residues can then be enhanced to restrict
molecular motion or diminished to enhance molecular flexibility.

5.15.1 Cry3Bb.11095

Loops β2,3, β6,7 and β10,11, located in domain 2 of Cry3Bb, have been identified
as important for bioactivity. The three loops are surface-exposed and structurally close
together. Amino acid Q348 in the WT structure, located in β-strand 2 just prior to lβ2,3,
does not form any intramolecular contacts. However, replacing Q348 with arginine (Q348R)
results in the formation of 2 new hydrogen bonds between R348 and the backbone carbonyls
of R487 and R488, both located in lβ10,11. The new hydrogen bonds may act to stabilize the
structure formed by the three loops. Certainly, the structure around R348 is more tightly
packed as determined by X-ray crystallography. The designed protein carrying this change,
Cry3Bb.11095, is 4.6-fold more active than WT Cry3Bb.

5.16 Example 16 - Design Method 10: Combinatorial Analysis
and Mutagenesis

Individual sites in the engineered Cry3Bb molecule can be used together to create
a Cry3Bb molecule with activity even greater than the activity of any one site. This method
has not been precisely applied to any δ-endotoxin. It is also not obvious that improvements
in two sites can be pulled together to improve the biological activity of the protein. In fact,
data demonstrate that improvements to 2 sites, when pulled together into a single construct,
do not necessarily further improve the biological activity of Cry3Bb. In some cases, the
combination resulted in decreased protein stability and/or activity. Examples of proteins with
site combinations that resulted in improved activity compared to WT Cry3Bb but decreased
activity compared to 1 or more of the "parental" proteins are Cry3Bb.11235, 11046, 11057
and 11058. Cry3Bb.11082, which contains designed regions from 4 parental proteins, retains
the level of activity from the most active parental strain (Cry3Bb.11230) but does not show
an increase in activity. These proteins are listed in Table 7. The following are examples of
instances where combined mutations have significantly improved biological activity.

5.16.1 Cry3Bb.11231

Designed protein Cry3Bb.11231 contains the alterations found in Cry3Bb.11224
(H231R) and Cry3Bb.11228 (changes in lβ1,α8). The combination of amino acid changes
found in Cry3Bb.11231 results in an increase in bioactivity against SCRW larvae of
approximately 8-fold over that of WT Cry3Bb (Table 2). This increase is greater than that exhibited
by either Cry3Bb.11224 (5.0x) or Cry3Bb.11228 (4.1x) alone. Cry3Bb.11231 also
exhibits a 12.9-fold increase in activity compared to WT Cry3Bb against WCRW larvae
(Table 10).

5.16.2 Cry3Bb.11081

Designed Cry3Bb protein Cry3Bb.11081 was constructed by combining the
changes found in Cry3Bb.11032 and Cry3Bb.11229 (with the exception of Y318C).
Cry3Bb.11081 exhibits a 6.1-fold increase in activity over WT Cry3Bb; a greater increase in activity
than either of the individual parental proteins, Cry3Bb.11032 (3.1-fold) and Cry3Bb.11229
(2.5-fold).

5.16.3 Cry3Bb.11083

Designed Cry3Bb protein Cry3Bb.11083 was constructed by combining the
changes found in Cry3Bb.11036 and Cry3Bb.11095. Cry3Bb.11083 exhibits a 7.4-fold
increase in activity against SCRW larvae compared to WT Cry3Bb; a greater increase than
either Cry3Bb.11036 (4.3x) or Cry3Bb.11095 (4.6x). Cry3Bb.11083 also exhibits a 5.4-fold
increase in activity against WCRW larvae compared to WT Cry3Bb (Table 10).

5.16.4 Cry3Bb.11084

Designed Cry3Bb protein Cry3Bb.11084 was constructed by combining the
changes found in Cry3Bb.11032 and the S311L change found in Cry3Bb.11228.
Cry3Bb.11084 exhibits a 7.2-fold increase in activity over that of WT Cry3Bb; a greater increase than
either Cry3Bb.11032 (3.1x) or Cry3Bb.11228 (4.1x).
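
The comparison made in Sections 5.16.1 to 5.16.4 can be tabulated with a short script. This is only an illustrative sketch: the fold increases are the SCRW values quoted in those sections (the Cry3Bb.11231 value is given there as approximately 8-fold), while the table layout and function names are hypothetical.

```python
# Fold increase in SCRW bioactivity relative to WT Cry3Bb, as quoted in
# Sections 5.16.1 to 5.16.4. Each entry is (combined value, parental values).
# Note: Cry3Bb.11084 carries only the S311L change from Cry3Bb.11228, and
# Cry3Bb.11081 omits Y318C from Cry3Bb.11229.
COMBINATIONS = {
    "Cry3Bb.11231": (8.0, {"Cry3Bb.11224": 5.0, "Cry3Bb.11228": 4.1}),
    "Cry3Bb.11081": (6.1, {"Cry3Bb.11032": 3.1, "Cry3Bb.11229": 2.5}),
    "Cry3Bb.11083": (7.4, {"Cry3Bb.11036": 4.3, "Cry3Bb.11095": 4.6}),
    "Cry3Bb.11084": (7.2, {"Cry3Bb.11032": 3.1, "Cry3Bb.11228": 4.1}),
}


def gain_over_best_parent(combined, parents):
    """Ratio of the combined protein's activity to its most active parent."""
    return combined / max(parents.values())


def summarize(table):
    """Map each combined protein to its gain over the best parent (2 d.p.)."""
    return {
        name: round(gain_over_best_parent(combined, parents), 2)
        for name, (combined, parents) in table.items()
    }


if __name__ == "__main__":
    for name, gain in summarize(COMBINATIONS).items():
        print(f"{name}: {gain}x the most active parent")
```

A ratio above 1 marks a combination that outperforms its most active parent; by the same measure, the combinations described at the start of this example (such as Cry3Bb.11082) would score at or below 1.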

5.16.5 Cry3Bb.11098

Designed Cry3Bb protein Cry3Bb.11098 was constructed to contain the following
amino acid changes: D165G, H231R, S311L, N313T, and E317K. The nucleic acid sequence
is given in SEQ ID NO:107, and the encoded amino acid sequence is given in SEQ ID
NO:108.
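
A list of substitutions written in this notation can be applied to a sequence programmatically. The sketch below is an assumption-laden illustration, not the construction method used here: the function name is hypothetical, and the six-residue test sequence is a toy string, not the Cry3Bb sequence of SEQ ID NO:108. Each code is checked against the wild-type residue before it is applied, which catches numbering errors.

```python
import re

# Substitution code: wild-type residue, 1-based position, replacement residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence, mutations):
    """Return a copy of `sequence` with every substitution in `mutations` applied.

    Raises ValueError if a code is malformed, out of range, or names a
    wild-type residue that does not match the input sequence.
    """
    residues = list(sequence)
    for code in mutations:
        match = MUTATION.match(code)
        if match is None:
            raise ValueError(f"unrecognized mutation code: {code!r}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert 1-based position to index
        if not 0 <= index < len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if sequence[index] != wild_type:
            raise ValueError(
                f"{code}: sequence has {sequence[index]!r} at this position"
            )
        residues[index] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Toy sequence and positions, chosen only to exercise the function.
    print(apply_mutations("MDHSNE", ["D2G", "H3R", "S4L", "N5T", "E6K"]))
```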

5.17 Example 17 - Design Strategy 11: Alteration of Binding to
Glycoproteins and to WCRW Brush Border Membranes

While the identity of receptor(s) for Cry3Bb is unknown, it is nonetheless important
to increase the interaction of the toxin with its receptor. One way to improve the
toxin-receptor interaction without knowing the identity of the receptor is to reduce or eliminate
non-productive binding to other biomolecules. The inventors have observed that Cry3Bb binds
non-specifically to bovine serum albumin (BSA) that has been glycosylated with a variety of
sugar groups, but not to non-glycosylated BSA. Cry3A, which is not active on Diabrotica
species, shows similar but even greater binding to glycosylated-BSA. Similarly, Cry3A
shows greater binding to immobilized WCRW brush border membrane (BBM) than does
WT Cry3Bb, suggesting that much of the observed binding is non-productive. It was
reasoned that the non-specific binding to WCRW BBM occurs via glycosylated proteins, and
that binding to both glycosylated-BSA and WCRW BBM is non-productive in the reaction
pathway to toxicity. Therefore, reduction or elimination of that binding would lead to
enhanced binding to the productive receptor and to enhanced toxicity. Potential binding sites
for sugar groups were targeted for redesign to reduce the nonspecific binding of Cry3Bb to
glycoproteins and to immobilized WCRW BBM.

5.17.1 Cry3Bb.60

Cry3Bb.60, in which Cry3Bb has been cleaved at R159 in lα3,4, shows decreased
binding to glycosylated-BSA and decreased binding to immobilized WCRW BBM. Cry3Bb.60
shows a 3.6-fold increase in bioactivity relative to WT Cry3Bb.

5.17.2 Alterations to lα3,4 (Cry3Bb.11221)

Cry3Bb.11221 has been redesigned in the lα3,4 region of domain 1, which is the
region in which Cry3Bb is cleaved to produce Cry3Bb.60. Cry3Bb.11221 also shows
decreased binding to both glycosylated-BSA and immobilized WCRW BBM, and exhibits a
6.4-fold increase in bioactivity over that of WT Cry3Bb. Together with data for Cry3Bb.60
(section 5.17.1), these data suggest that this loop region contributes substantially to
non-productive binding of the toxin.

5.17.3 Alteration to lβ1,α8 (Cry3Bb.11228, 11230, 11237 and 11231)

The lβ1,α8 region of Cry3Bb has been re-engineered to increase hydration (section
4.2.4) and enhance flexibility (section 4.4.3). Several proteins altered in this region,
Cry3Bb.11228, 11230, and 11237, demonstrate substantially lower levels of binding to both
glycosylated-BSA and immobilized WCRW BBM, and also show between 4.1- and 4.5-fold
increases in bioactivity relative to WT Cry3Bb.

5.17.4 Binding Activity

The tendencies of Cry3Bb and some of its derivatives to bind to glycosylated-BSA
and to WCRW BBM were determined using a BIAcore™ surface plasmon resonance
biosensor. For glycosylated-BSA binding, the glycosylated protein was immobilized using
standard NHS chemistry to a CM5 chip (BIAcore), and the solubilized toxin was injected over
the glycosylated-BSA surface. To measure binding to WCRW BBM, brush border
membrane vesicles (BBMV) purified from WCRW midguts (English et al., 1991) were
immobilized on an HPA chip (BIAcore), then washed with either 10 mM KOH or with 40 mM
β-octylglucoside. The solubilized toxin was then injected over the resulting hybrid bilayer
surface to detect binding. Protein concentrations were determined by Protein Dye Reagent assay
(BioRad) or BCA Protein Assay (Pierce).

Other methods may also be used to determine the same binding information. These include,
but are not limited to, ligand blot experiments using labeled toxin, labeled glycosylated
protein, or anti-toxin antibodies, affinity chromatography, and in vitro binding of toxin to intact
BBMV.

5.18 Example 18 - Construction of Plasmids With WT cry3Bb Sequences 
Standard recombinant DNA procedures were performed essentially as described
by Sambrook et al. (1989).

5.18.1 pEG1701

pEG1701 (FIG. 11), contained in EG11204 and EG11037, was constructed by
inserting the SphI-PstI fragment containing the cry3Bb gene and the cry1F terminator from
pEG911 (Baum, 1994) into the SphI-PstI site of pEG854.9 (Baum et al., 1996), a high copy
number B. thuringiensis - E. coli shuttle vector.

5.18.2 pEG1028

pEG1028 contains the HindIII fragment of cry3Bb from pEG1701 cloned into the
multiple cloning site of pTZ18U at HindIII.

5.19 Example 19 - Construction of Plasmids with Altered cry3Bb Genes 
Plasmid DNA from E. coli was prepared by the alkaline lysis method (Maniatis et
al., 1982) or by commercial plasmid preparation kits (examples: PERFECTprep™ kit, 5
Prime - 3 Prime, Inc., Boulder CO; QIAGEN plasmid prep kit, QIAGEN Inc.). B.
thuringiensis plasmids were prepared from cultures grown in brain heart infusion plus 0.5%
glycerol (BHIG) to mid logarithmic phase by the alkaline lysis method. When necessary for
purification, DNA fragments were excised from an agarose gel following electrophoresis and
recovered by glass milk using a Geneclean II® kit (BIO 101 Inc., La Jolla, CA). Alteration of
the cry3Bb gene was accomplished using several techniques including site-directed
mutagenesis, triplex PCR™, quasi-random PCR™ mutagenesis, DNA shuffling and standard
recombinant techniques. These techniques are described in Sections 6.1, 6.2, 6.3, 6.4 and
6.5, respectively. The DNA sequences of primers used are listed in Section 7.

5.20 Example 20 - Site-Directed Mutagenesis

Site-directed mutagenesis was conducted by the protocols established by Kunkle
(1985) and Kunkle et al. (1987) using the Muta-Gene™ M13 in vitro mutagenesis kit
(Bio-Rad, Richmond, CA). Combinations of alterations to cry3Bb were accomplished by using
the Muta-Gene™ kit and multiple mutagenic oligonucleotide primers.

5.20.1 pEG1041

pEG1041, contained in EG11032, was constructed using the Muta-Gene™ kit,
primer C, and single-stranded pEG1028 as the DNA template. The resulting altered cry3Bb
DNA sequence was excised as a PflMI DNA fragment and used to replace the corresponding
DNA fragment in pEG1701.

5.20.2 pEG1046

pEG1046, contained in EG11035, was constructed using the Muta-Gene™ kit,
primer D, and single-stranded pEG1028 as the DNA template. The resulting altered cry3Bb
DNA sequence was excised as a PflMI DNA fragment and used to replace the corresponding
DNA fragment in pEG1701.

5.20.3 pEG1047

pEG1047, contained in EG11036, was constructed using the Muta-Gene™ kit,
primer E, and single-stranded pEG1028 as the DNA template. The resulting altered cry3Bb
DNA sequence was excised as a PflMI DNA fragment and used to replace the corresponding
DNA fragment in pEG1701.

5.20.4 pEG1052

pEG1052, contained in EG11046, was constructed using the Muta-Gene™ kit,
primers D and E, and single-stranded pEG1028 as the DNA template. The resulting altered
cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to replace the
corresponding DNA fragment in pEG1701.

5.20.5 pEG1054

pEG1054, contained in EG11048, was constructed using the Muta-Gene™ kit,
primer F, and single-stranded pEG1028 as the DNA template. The resulting altered cry3Bb
DNA sequence was excised as a PflMI DNA fragment and used to replace the corresponding
DNA fragment in pEG1701.

5.20.6 pEG1057

pEG1057, contained in EG11051, was constructed using the Muta-Gene™ kit,
primer G, and single-stranded pEG1028 as the DNA template. The resulting altered cry3Bb
DNA sequence was excised as a PflMI DNA fragment and used to replace the corresponding
DNA fragment in pEG1701.

5.21 Example 21 - Triplex PCR™

Triplex PCR™ is described by Michael (1994). This method makes use of a
thermostable ligase to incorporate a phosphorylated mutagenic primer into an amplified DNA
fragment during PCR™. PCR™ was performed on a Perkin Elmer Cetus DNA Thermal
Cycler (Perkin-Elmer, Norwalk, CT) using an AmpliTaq™ DNA polymerase kit (Perkin-Elmer)
and SphI-linearized pEG1701 as the template DNA. PCR™ products were cleaned using
commercial kits such as Wizard™ PCR™ Preps (Promega, Madison, WI) and QIAquick
PCR™ Purification kit (QIAGEN Inc., Chatsworth, CA).

5.21.1 pEG1708 and pEG1709

pEG1708 and pEG1709, contained in EG11222 and EG11223, respectively, were
constructed by replacing the PflMI-PflMI fragment of cry3Bb in pEG1701 with PflMI-digested
and gel purified PCR™ fragment altered at cry3Bb nucleotide positions 688-690,
encoding amino acid Y230. Random mutations were introduced into the Y230 codon by
triplex PCR™. Mutagenic primer MVT095 was phosphorylated and used together with outside
primer pair FW001 and FW006. Primer MVT095 also contains a silent mutation at position
687, changing T to C, which, upon incorporation, introduces an additional EcoRI site into
pEG1701.
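The nucleotide positions and amino acid residues quoted in these constructions can be cross-checked with simple codon arithmetic. The following is a minimal sketch only, assuming that cry3Bb nucleotides are numbered from the first base of the initiation codon so that residue n is encoded by nucleotides 3n-2 through 3n; the function names are illustrative and are not part of the described methods.

```python
from typing import Tuple


def codon_to_nucleotides(residue: int) -> Tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of a codon.

    Assumes nucleotide 1 is the first base of the initiation codon,
    so residue n is encoded by nucleotides 3n-2 .. 3n.
    """
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    last = 3 * residue
    return last - 2, last


def region_to_residues(first_nt: int, last_nt: int) -> Tuple[int, int]:
    """Return (first, last) residues whose codons exactly span first_nt..last_nt."""
    if last_nt < first_nt or first_nt % 3 != 1 or last_nt % 3 != 0:
        raise ValueError("range does not start and end on codon boundaries")
    return (first_nt + 2) // 3, last_nt // 3


if __name__ == "__main__":
    # Y230 codon, as quoted for pEG1708 and pEG1709.
    print(codon_to_nucleotides(230))
    # A 21-nucleotide region spanning seven whole codons.
    print(region_to_residues(460, 480))
```

Under this numbering assumption, codon 230 maps to nucleotides 688-690, in agreement with the positions stated above for Y230.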

5.21.2 pEG1710, pEG1711 and pEG1712

Plasmids pEG1710, pEG1711 and pEG1712, contained in EG11224, EG11225
and EG11226, respectively, were created by replacing the PflMI-PflMI fragment of the
cry3Bb gene in pEG1701 with PflMI-digested and gel purified PCR™ fragment altered at
cry3Bb nucleotide positions 690-692, encoding H231. Random mutations were introduced
into the H231 codon by triplex PCR™. Mutagenic primer MVT097 was phosphorylated and
used together with outside primer pair FW001 and FW006. Primer MVT097 also contains a
T to C sequence change at position 687 which, upon incorporation, results in an additional
EcoRI site by silent mutation.

5.21.3 pEG1713 and pEG1727

pEG1713 and pEG1727, contained in EG11227 and EG11242, respectively, were
constructed by replacing the PflMI-PflMI fragment of the cry3Bb gene in pEG1701 with
PflMI-digested and gel purified PCR™ fragment altered at cry3Bb nucleotide positions
868-870, encoding amino acid R290. Triplex PCR™ was used to introduce random changes into
the R290 codon. The mutagenic primer, MVT091, was designed so that the nucleotide
substitutions would result in approximately 36% of the sequences encoding amino acids D or E.
MVT091 was phosphorylated and used together with outside primer pair FW001 and
FW006.

5.22 Example 22 - Quasi-Random PCR™ Mutagenesis

Quasi-random mutagenesis combines the mutagenic PCR™ techniques described
by Vallette et al. (1989), Tomic et al. (1990) and LaBean and Kauffman (1993). Mutagenic
primers, sometimes over 70 nucleotides in length, were designed to introduce changes over
nucleotide positions encoding an entire structural region, such as a loop. Degenerate
codons typically consisted of a ratio of 82% WT nucleotide plus 6% each of the other 3
nucleotides per position to semi-randomly introduce changes over the target region (LaBean
and Kauffman, 1993). When possible, natural restriction sites were utilized; class 2s
enzymes were used when natural sites were not convenient (Stemmer and Morris, 1992, list
additional restriction enzymes useful to this technique). PCR™ was performed on a Perkin
Elmer Cetus DNA Thermal Cycler (Perkin-Elmer, Norwalk, CT) using an AmpliTaq™ DNA
polymerase kit (Perkin-Elmer) and SphI-linearized pEG1701 as the template DNA.
Quasi-random PCR™ amplification was performed using the following conditions: denaturation at
94°C for 1.5 min.; annealing at 50°C for 2 min. and extension at 72°C for 3 min., for 30
cycles. The final 14 extension cycles were extended an additional 25 s per cycle. Primer
concentration was 20 µM per reaction or 40 pM for long, mutagenic primers. PCR™ products
were cleaned using commercial kits such as Wizard™ PCR™ Preps (Promega, Madison, WI)
and QIAquick PCR™ Purification kit (QIAGEN Inc., Chatsworth, CA). In some instances
PCR™ products were treated with Klenow Fragment (Promega) following the manufacturer's
instructions to fill in any single base overhangs prior to restriction digestion.
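The expected mutational load produced by the 82%/6%/6%/6% nucleotide mixture can be estimated with elementary probability. The sketch below is illustrative only; it assumes that every position of the target region is doped independently at the stated ratio, which may not hold for every primer, and the function names are hypothetical.

```python
from math import comb


def p_unchanged(n_positions: int, wt_fraction: float = 0.82) -> float:
    """Probability that all doped positions retain the WT nucleotide."""
    return wt_fraction ** n_positions


def expected_substitutions(n_positions: int, wt_fraction: float = 0.82) -> float:
    """Mean number of non-WT nucleotides per primer molecule."""
    return n_positions * (1.0 - wt_fraction)


def p_exactly_k(n_positions: int, k: int, wt_fraction: float = 0.82) -> float:
    """Binomial probability of exactly k substituted positions."""
    mut = 1.0 - wt_fraction
    return comb(n_positions, k) * mut ** k * wt_fraction ** (n_positions - k)


if __name__ == "__main__":
    n = 21  # e.g., a target region of 21 nucleotides (seven codons)
    print(f"P(no change)            = {p_unchanged(n):.4f}")
    print(f"expected substitutions  = {expected_substitutions(n):.2f}")
    for k in range(0, 7):
        print(f"P(exactly {k} changes)  = {p_exactly_k(n, k):.4f}")
```

For a 21-nucleotide region, this model predicts that fewer than 2% of primer molecules carry the unaltered WT sequence and that a molecule carries between three and four nucleotide substitutions on average.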

5.22.1 pEG1707

pEG1707, contained in EG11221, was constructed by replacing the PflMI-PflMI
fragment of the cry3Bb gene in pEG1701 with PflMI-digested and gel purified PCR™
fragment altered at cry3Bb nucleotide positions 460-480, encoding lα3,4 amino acids 154-160.
Primer MVT075, which includes a recognition site for the class 2s restriction enzyme BsaI,
and primer FW006 were used to introduce changes into this region by quasi-random
mutagenesis. Primer MVT076, also containing a BsaI site, and primer FW001 were used to
PCR™ amplify a "linker" fragment. Following PCR™ amplification, both products were
cleaned, end-filled, digested with BsaI and ligated to each other. The ligated fragment was gel
purified and used as template for PCR™ amplification using primer pair FW001 and FW006.
The PCR™ product was cleaned, digested with PflMI, gel purified and ligated into
PflMI-digested and purified pEG1701 vector DNA.

5.22.2 pEG1720 and pEG1726

pEG1720 and pEG1726, contained in EG11234 and EG11241, respectively, were
constructed by replacing the PflMI-PflMI fragment of the cry3Bb gene in pEG1701 with
PflMI-digested and gel purified PCR™ fragment altered at cry3Bb nucleotide positions
859-885, encoding lα7,β1 amino acids 287-295. Quasi-random PCR™ mutagenesis was used to
introduce changes into this region. Mutagenic primer MVT111, designed with a BsaI site,
and primer FW006 were used to introduce the changes. Primer pair MVT094, also
containing a BsaI site, and FW001 were used to amplify the linker fragment. The PCR™ products
were digested with BsaI, gel purified, then ligated to each other. The ligated product was
PCR™ amplified using primer pair FW001 and FW006 and digested with PflMI.

5.22.3 pEG1714, pEG1715, pEG1716, pEG1718, pEG1719, pEG1722, pEG1723,
pEG1724 and pEG1725

pEG1714, pEG1715, pEG1716, pEG1718, pEG1719, pEG1722, pEG1723,
pEG1724 and pEG1725, contained in EG11228, EG11229, EG11230, EG11232, EG11233,
EG11236, EG11237, EG11238 and EG11239, respectively, were constructed by replacing
the PflMI-PflMI fragment of the cry3Bb gene in pEG1701 with PflMI-digested and gel
purified PCR™ fragment altered at cry3Bb nucleotide positions 931-954, encoding lβ1,α8 amino
acids 311-318. Quasi-random PCR™ mutagenesis was used to introduce changes into this
region using mutagenic primer MVT103 and primer FW006. Primers FW001 and FW006
were used to amplify a linker fragment. The PCR™ products were end-filled using Klenow
and digested with BamHI. The larger fragment from the FW001-FW006 digest was gel
purified then ligated to the digested MVT103-FW006 fragment. The ligated product was gel
purified and amplified by PCR™ using primer pair FW001 and FW006. The amplified product
was digested with PflMI and gel purified prior to ligation into PflMI-digested and purified
pEG1701 vector DNA.

5.22.4 pEG1701.lβ2,3

Plasmids carrying alterations of cry3Bb WT sequence at nucleotides 1051-1065,
encoding structural region lβ2,3 of Cry3Bb, were constructed by replacing the MluI-SpeI
fragment of pEG1701 with isolated MluI- and SpeI-digested PCR™ product. The PCR™
product was generated by quasi-random PCR™ mutagenesis where mutagenic primer
MVT081 was paired with FW006. These plasmids as a group are designated pEG1701.lβ2,3.

5.22.5 pEG1701.lβ6,7

Plasmids containing mutations of the cry3Bb WT sequence at nucleotides 1234-1248,
encoding structural region lβ6,7 of Cry3Bb, were constructed by replacing the MluI-SpeI
fragment of pEG1701 with isolated MluI- and SpeI-digested PCR™ product. The
PCR™ product was generated by quasi-random PCR™ mutagenesis where mutagenic primer
MVT085 was paired with primer WD115. Primer pair MVT089 and WD112 were used to
amplify a linker fragment. Both PCR™ products were digested with TaqI and ligated to each
other. The ligation product was gel purified and PCR™ amplified using primer pair
MVT089 and FW006. The amplified product was digested with MluI and SpeI and ligated
into MluI- and SpeI-digested and purified pEG1701 vector DNA. These plasmids as a group
are designated pEG1701.lβ6,7.

5.22.6 pEG1701.lβ10,11

Plasmids containing mutated cry3Bb sequences at nucleotides 1450-1467,
encoding structural region lβ10,11 of Cry3Bb, were constructed by replacing the SpeI-PstI
fragment of pEG1701 with isolated SpeI- and PstI-digested PCR™ product. The PCR™ product
was generated by quasi-random PCR™ mutagenesis where mutagenic primer MVT105 was
paired with primer MVT070. Primer pair MVT092 and MVT083 were used to generate a
linker fragment. (MVT083 is a mutagenic oligo designed for another region. The sequence
changes introduced by MVT083 are removed following restriction digestion and do not
impact the alteration of cry3Bb in the lβ10,11 region.) Both PCR™ products were digested
with BsaI, ligated together, and the ligation product PCR™ amplified with primer pair
MVT083 and MVT070. The resulting PCR™ product was digested with SpeI and PstI, and
gel purified. These plasmids as a group are designated pEG1701.lβ10,11.

5.23 Example 23 - DNA Shuffling

DNA-shuffling, as described by Stemmer (1994), was used to combine individual
alterations in the cry3Bb gene.

5.23.1 pEG1084, pEG1085, pEG1086 and pEG1087

pEG1084, pEG1085, pEG1086, and pEG1087, contained in EG11081, EG11082,
EG11083, and EG11084, respectively, were recovered from DNA-shuffling. Briefly, PflMI
DNA fragments were generated using primer set A and B and each of the plasmids pEG1707,
pEG1714, pEG1715, pEG1716, pEG1041, pEG1046, pEG1047, and pEG1054 as DNA
templates. The resulting DNA fragments were pooled in equimolar amounts and digested
with DNaseI, and 50-100 bp DNA fragments were recovered from an agarose gel by three
successive freeze-thaw cycles: three min in a dry-ice ethanol bath followed by complete
thawing at 50°C. The recovered DNA fragments were assembled by primerless PCR™ and
PCR™-amplified using the primer set A and B as described by Stemmer (1994). The final
PCR™-amplified DNA fragments were cut with PflMI and used to replace the corresponding
cry3Bb PflMI DNA fragment in pEG1701.

5.24 Example 24 - Recombinant DNA Techniques 

Standard recombinant DNA procedures were performed essentially as described
by Sambrook et al. (1989).

5.24.1 pEG1717

pEG1717, contained in EG11231, was constructed by replacing the small BglII
fragment of pEG1710 with the small BglII fragment from pEG1714.

5.24.2 pEG1721

pEG1721, contained in EG11235, was constructed by replacing the small BglII
fragment from pEG1710 with the small BglII fragment from pEG1087.

5.24.3 pEG1062

pEG1062, contained in EG11057, was constructed by replacing the NcoI DNA
fragment containing ori 43 from pEG1054 with the isolated NcoI DNA fragment containing
ori 43 and the alterations in cry3Bb from pEG1046.

5.24.4 pEG1063

pEG1063, contained in EG11058, was constructed by replacing the NcoI DNA
fragment containing ori 43 from pEG1054 with the isolated NcoI DNA fragment containing
ori 43 and the alterations in cry3Bb from pEG1707.

5.24.5 pEG1095

pEG1095, contained in EG11095, was constructed by replacing the MluI-SpeI
DNA fragment in pEG1701 with the corresponding MluI-SpeI DNA fragment from
pEG1086.

5.25 Example 25 - Primers Utilized in Constructing Cry3Bb* Variants

Shown below are the primers used for site-directed mutagenesis, triplex PCR™
and quasi-random PCR™ to prepare the cry3Bb* variants as described above. Primers were
obtained from Ransom Hill Bioscience, Inc. (Ramona, CA) and Integrated DNA
Technologies, Inc. (Coralville, IA). The specific composition of the primers containing particular
degeneracies at one or more residues is given in Section 5.30, Example 30.

5.25.1 Primer FW001 (SEQ ID NO:71):

5'-AGACAACTCTACAGTAAAAGATG-3'

5.25.2 Primer FW006 (SEQ ID NO:72):

5'-GGTAATTGGTCAATAGAATC-3'

5.25.3 Primer MVT095 (SEQ ID NO:73):

5'-CAGAAGATGTTGCTGAATTCNNNCATAGACAATTAAAAC-3'

5.25.4 Primer MVT097 (SEQ ID NO:74):

5'-GATGTTGCTGAATTCTATNNNAGACAATTAAAAC-3'

5.25.5 Primer MVT091 (SEQ ID NO:75):

5'-CCCATTTTATGATATTBDNTTATACTCAAAAGG-3'

5.25.6 Primer MVT075 (SEQ ID NO:76):

5'-AGCTATGCTGGTCTCGGAAGAAA£FNFFNFJNJFJFJNFiNJFJAAAAGAAGCCAAGATCGAAT-3'

5.25.7 Primer MVT076 (SEQ ID NO:77):

5'-GGTCACCTAGGTCTCTCTTCCAGGAATTTAACGCATTAAC-3'

5.25.8 Primer MVT111 (SEQ ID NO:78):

5'-AGCTATGCTGGTCTCCCATTTJEHIEJEJJEIIKRRJEHEIJEENIIIGTTAAAACAGAACTAAC-3'

5.25.9 Primer MVT094 (SEQ ID NO:79):

5'-ATCCAGTGGGGTCTCAAATGGGAAAAGTACAATTAG-3'

5.25.10 Primer MVT103 (SEQ ID NO:80):

5'-CATTTTTACGGATCCAATTTTTJFFFJNEEJEFNFJ7^FEILEIJEOGGACCAACTTTTTTGAG-3'

5.25.11 Primer MVT081 (SEQ ID NO:81):

5'<5AATTTCATACGCGTCrTCAACCTGGTJEHJiJIINMEEIEJTCTTTCAATTATTGGTCTGG-3'

5.25.12 Primer MVT085 (SEQ ID NO:82):

5'-AAAAGTTTATCOAACTATAGCTAATACAGACGTAGCGGCTJQOFFNEEJnJEElCTAT.ATTTAGGTGTTACG-3'

5.25.13 Primer A (SEQ ID NO:83) 3b2pflm1:

5'-GGAGTTCCATTTGCTGGGGC-3'

5.25.14 Primer B (SEQ ID NO:84) 3b2pflm2:

5'-ATCTCCATAAAATGGGG-3'

5.25.15 Primer C (SEQ ID NO:85) 3b2165DG:

5'-GCGAAGTAAAAGAAGCCAAGGTCGAATAAGGG-3'

5.25.16 Primer D (SEQ ID NO:86) 3B2160SKRD:

5'-CCTTTAAGTTTGCGAAATCCACACAGCCAAGGTCGAATAAGGG-3'

5.25.17 Primer E (SEQ ID NO:87) 3b2290VP:

5'-CCCATTTTATGATGTTCGGTTATACCCAAAAGGGG-3'

5.25.18 Primer F (SEQ ID NO:88) 3b2EdA104:

5'-GGCCAAGTGAAGACCCATGGAAGGC-3'

5.25.19 Primer G (SEQ ID NO:89) 3b2KG189:

5'-GCAGTTTCCGGATTCGAAGTGC-3'

5.25.20 Primer WD112 (SEQ ID NO:90):

5'-CCGCTACGTCTGTATTA-3'

5.25.21 Primer WD115 (SEQ ID NO:91):

5'-ATAATGGAAGCACCTGA-3'
5.25.22 Primer MVT105 (SEQ ID NO:92):

5'-AGCTATGCTGGTCTCTTCTTAEJIFEIlEFFIJFIJIINACAATTCCATTTTTTACTTGG-3'

5.25.23 Primer MVT092 (SEQ ID NO:93):

5'-ATCCAGTTGGGTCTCTAAGAAACAAACCGCGTAATTAAGC-3'

5.25.24 Primer MVT070 (SEQ ID NO:94):

5'-CCTCAAGGGTTATAACATCC-3'

5.25.25 Primer MVT083 (SEQ ID NO:95):

5'-GTACAAAAGCTAAGCTTTIEJIINPEEMEEIJNJESCGAACTATAGCTAATACAG-3'

5.26 Example 26 - Sequence Analysis of Altered cry3Bb Genes

E. coli DH5α™ (GIBCO BRL, Gaithersburg, MD), JM110 and Sure™
(Stratagene, La Jolla, CA) cells were sometimes used to amplify plasmid DNA for sequencing.
Plasmids were transformed into these cells using the manufacturers' procedures. DNA was
sequenced using the Sequenase® 2.0 DNA sequencing kit purchased from U. S. Biochemical
Corporation (Cleveland, Ohio). The plasmids described in Section 6, their respective
divergence from WT cry3Bb sequence, the resulting amino acid changes and the protein structure
site of the changes are listed in Table 11.

5.27 Example 27 - Expression of Cry3Bb* Proteins

5.27.1 Culture Conditions

LB agar was prepared using a standard formula (Maniatis et al., 1982). Starch
agar was obtained from Difco Laboratories (Detroit, MI) and supplemented with an
additional 5 g/l of agar. C2 liquid medium is described by Donovan et al. (1988). C2 medium
was sometimes prepared without the phosphate buffer (C2-P). All cultures were incubated at
25°C to 30°C; liquid cultures were also shaken at 250 rpm, until sporulation and lysis had
occurred.

5.27.2 Transformation Conditions

pEG1701 and derivatives thereof were introduced into acrystalliferous
B. thuringiensis var. kurstaki EG7566 (Baum, 1994) or EG10368 (U. S. Patent 5,322,687) by
the electroporation method of Macaluso and Mettus (1991). In some cases, the method was
modified as follows to maximize the number of transformants. The recipient B. thuringiensis
strain was inoculated from overnight growth at 30°C on LB agar into brain heart infusion
plus 0.5% glycerol, grown to an optical density of approximately 0.5 at 600 nm, chilled on
ice for 10 min, washed 2X with EB and resuspended in a 1/50 volume of EB. Transformed
cells were selected on LB agar or starch agar plus 5 µg/ml chloramphenicol. Visual
screening of colonies was used to identify transformants producing crystalline protein; those
colonies were generally more opaque than colonies that did not produce crystalline protein.

5.27.3 Strain and Protein Designations

A transformant containing an altered cry3Bb* gene encoding an altered Cry3Bb*
protein is designated by an "EG" number, e.g., EG11231. The altered Cry3Bb* protein is
designated Cry3Bb followed by the strain number, e.g., Cry3Bb.11231. Collections of
proteins with alterations at a structural site are designated Cry3Bb followed by the structural site,
e.g., Cry3Bb.lβ2,3. Table 12 lists the plasmids pertinent to this invention, the new
B. thuringiensis strains containing the plasmids, the acrystalliferous B. thuringiensis recipient
strain used, and the proteins produced by the new strains.
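The designation convention described above can be expressed as a simple string transformation. The following sketch is illustrative only; the function names are hypothetical, and the convention applies only to strains carrying an altered cry3Bb* gene (strains carrying pEG1701 itself express WT Cry3Bb and are not renamed this way).

```python
import re


def protein_designation(strain: str) -> str:
    """Derive the protein name from an "EG" strain number.

    Spaces are removed first so that extraction-damaged forms such as
    "EG 11046" are accepted. Intended only for strains that carry an
    altered cry3Bb* gene.
    """
    match = re.fullmatch(r"EG(\d+)", strain.replace(" ", ""))
    if match is None:
        raise ValueError(f"not an EG strain number: {strain!r}")
    return f"Cry3Bb.{match.group(1)}"


def collection_designation(structural_site: str) -> str:
    """Name a collection of proteins altered at one structural site."""
    return f"Cry3Bb.{structural_site}"


if __name__ == "__main__":
    print(protein_designation("EG11231"))
    print(collection_designation("lβ2,3"))
```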
5.28 Example 28 - Generation and Characterization of Cry3Bb-60

5.28.1 Generation of Cry3Bb-60

Cry3Bb-producing strain EG7231 (U. S. Patent 5,187,091) was grown in C2
medium plus 3 mg/ml chloramphenicol. Following sporulation and lysis, the culture was
washed with water and Cry3Bb protein purified by the NaBr solubilization and
recrystallization method of Cody et al. (1992). Protein concentration was determined by BCA Protein
Assay (Pierce, Rockford, IL). Recrystallized protein was solubilized in 10 ml of 50 mM
KOH per 100 mg of Cry3Bb protein and buffered to pH 9.0 with 100 mM CAPS
(3-[cyclohexylamino]-1-propanesulfonic acid), pH 9.0. The soluble toxin was treated with
trypsin at a weight ratio of 50 mg toxin to 1 mg trypsin for 20 min to overnight at room
temperature. Trypsin cleaves proteins on the carboxyl side of available arginine and lysine
residues. For 8-dose bioassay, the solubilization conditions were altered slightly to increase the
concentration of protein: 50 mM KOH was added dropwise to 2.7 ml of a 12.77 mg/ml
suspension of purified Cry3Bb* until crystal solubilization occurred. The volume was then
adjusted to 7 ml with 100 mM CAPS, pH 9.0.
Table 12

Plasmids Carrying Altered cry3Bb* Genes Transformed into B. thuringiensis
for Expression of Altered Cry3Bb* Proteins

Plasmid Designation    New BT Strain                    Expressed Protein

pEG1701                EG11204                          WT Cry3Bb
pEG1701                EG11037                          WT Cry3Bb
pEG1707                EG11221                          Cry3Bb.11221
pEG1708                EG11222                          Cry3Bb.11222
pEG1709                EG11223                          Cry3Bb.11223
pEG1710                EG11224                          Cry3Bb.11224
pEG1711                EG11225                          Cry3Bb.11225
pEG1712                EG11226                          Cry3Bb.11226
pEG1713                EG11227                          Cry3Bb.11227
pEG1714                EG11228                          Cry3Bb.11228
pEG1715                EG11229                          Cry3Bb.11229
pEG1716                EG11230                          Cry3Bb.11230
pEG1717                EG11231                          Cry3Bb.11231
pEG1718                EG11232                          Cry3Bb.11232
pEG1719                EG11233                          Cry3Bb.11233
pEG1720                EG11234                          Cry3Bb.11234
pEG1721                EG11235                          Cry3Bb.11235
pEG1722                EG11236                          Cry3Bb.11236
pEG1723                EG11237                          Cry3Bb.11237
pEG1724                EG11238                          Cry3Bb.11238
pEG1725                EG11239                          Cry3Bb.11239
pEG1726                EG11241                          Cry3Bb.11241
pEG1727                EG11242                          Cry3Bb.11242
pEG1041                EG11032                          Cry3Bb.11032
pEG1046                EG11035                          Cry3Bb.11035
pEG1047                EG11036                          Cry3Bb.11036
pEG1052                EG11046                          Cry3Bb.11046
pEG1054                EG11048                          Cry3Bb.11048
pEG1057                EG11051                          Cry3Bb.11051
pEG1062                EG11057                          Cry3Bb.11057
pEG1063                EG11058                          Cry3Bb.11058
pEG1084                EG11081                          Cry3Bb.11081
pEG1085                EG11082                          Cry3Bb.11082
pEG1086                EG11083                          Cry3Bb.11083
pEG1087                EG11084                          Cry3Bb.11084
pEG1095                EG11095                          Cry3Bb.11095
pEG1098                EG11098                          Cry3Bb.11098
pEG1701.lβ2,3          collection of unnamed strains    Cry3Bb.lβ2,3
pEG1701.lβ6,7          collection of unnamed strains    Cry3Bb.lβ6,7
pEG1701.lβ10,11        collection of unnamed strains    Cry3Bb.lβ10,11

5.28.2 Determination of Molecular Weight of Cry3Bb-60

The molecular weight of the predominant trypsin digestion fragment of Cry3Bb
was determined to be 60 kDa by SDS-polyacrylamide gel electrophoresis (SDS-PAGE)
analysis using commercial molecular weight markers. This digestion fragment is designated
Cry3Bb-60. No further digestion of the 60 kDa cleavage product was observed.

5.28.3 Determination of NH2-Terminus of Cry3Bb-60

To determine the NH2-terminal sequence of Cry3Bb-60, the trypsin digest was
fractionated by SDS-PAGE and transferred to Immobilon™-P membrane (Millipore
Corporation, Bedford, MA) following standard western blotting procedures. After transfer, the
membrane was rinsed twice with water then stained with 0.025% Coomassie Brilliant Blue
R-250 plus 40% methanol for 5 min, destained with 50% methanol and rinsed in water. The
Cry3Bb.60 band was excised with a razor blade. NH2-terminal sequencing was performed at
the Tufts Medical School, Department of Physiology (Boston, MA) using standard automated
Edman degradation procedures. The NH2-terminal amino acid sequence was determined to
be SKRSQDR (SEQ ID NO:96), corresponding to amino acids 160-166 of Cry3Bb. Trypsin
digestion occurred on the carboxyl side of amino acid R159, resulting in the removal of
helices 1-3.
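The relationship between the trypsin specificity stated in Section 5.28.1 and the observed NH2-terminus can be illustrated by listing the candidate cleavage sites in the short stretch R159 through R166 reported above. This is a simplified sketch: it marks every internal lysine and arginine as a candidate, ignores residue accessibility and neighboring-residue effects, and the function name is hypothetical.

```python
from typing import List


def tryptic_cleavage_sites(sequence: str, first_residue: int = 1) -> List[int]:
    """Return residue numbers after which trypsin could cleave.

    Trypsin cuts on the carboxyl side of K and R. The last residue of the
    supplied sequence is skipped because nothing follows it here.
    `first_residue` is the residue number of sequence[0].
    """
    sites = []
    for index, amino_acid in enumerate(sequence[:-1]):
        if amino_acid in "KR":
            sites.append(first_residue + index)
    return sites


if __name__ == "__main__":
    # R159 followed by the determined NH2-terminal sequence SKRSQDR (160-166).
    stretch = "R" + "SKRSQDR"
    print(tryptic_cleavage_sites(stretch, first_residue=159))
```

The list includes R159, the site at which cleavage was observed, together with K161 and R162; candidate sites are not necessarily cleaved, consistent with the statement that trypsin acts on available arginine and lysine residues.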

5.29 Example 29 - Bioactivity of Cry3Bb* Proteins

5.29.1 Culture Conditions and Protein Concentration Determination

Cultures for 1-dose bioassays were grown in C2-P plus 5 µg/ml chloramphenicol
(C2-P/cm5) then diluted with 3 volumes of 0.005% Triton X-100®. The protein
concentrations of these cultures were not determined. Cultures for 8-dose bioassays were grown in
C2/cm5, washed 1 - 2 times with 1 - 2 volumes of sterile water and resuspended in 1/10
volume of sterile 0.005% Triton X-100®. The toxin protein concentration of each concentrate
was determined as described by Brussock and Currier (1990), omitting the treatment with
3 M HEPES. The protein concentration was adjusted to 3.2 mg/ml in 0.005% Triton X-100®
for the top dose of the assay. Cry3Bb.60 was produced and quantified for 8-dose assay as
described in Section 9.1.

5.29.2 Insect Bioassays

Diabrotica undecimpunctata howardi Barber (southern corn rootworm or SCRW)
and Diabrotica virgifera virgifera LeConte (western corn rootworm or WCRW) larvae were
reared as described by Slaney et al. (1992). Eight-dose assays and probit analyses were
performed as described by Slaney et al. (1992). Thirty-two larvae were tested per dose at 50 µl
of sample per well of diet (surface area of 175 mm²). Positive controls were WT
Cry3Bb-producing strains EG11037 or EG11204. All bioassays were performed using 128-well trays
containing approximately 1 ml of diet per well with perforated mylar sheet covers
(C-D International Inc., Pitman, NJ). One-dose assays were performed essentially the same except
only 1 dose was tested per strain. All assays were replicated at least twice.

5.29.3 Insect Bioassay Results: 1-Dose Assays Against SCRW

Results from 1-dose assays are expressed as the relative mortality (RM) of the
experimental strain compared to WT (% mortality of experimental culture divided by %
mortality of WT culture). Altered and improved Cry3Bb proteins derived from plasmids
constructed using PCR™ methods introducing random or semi-random changes into the cry3Bb
gene sequence were distinguished from other altered but not improved Cry3Bb proteins by
replicated, 1-dose assay against SCRW larvae. Those proteins showing increased activity
(defined as RM > 1.5) compared to WT Cry3Bb or, in the case of proteins with combinations
of altered sites, compared to a "parental" altered Cry3Bb protein were further characterized
by 8-dose assay. The overall RM "pattern" produced by 1-dose assay results from a
collection of proteins carrying random or semi-random alterations within a single structural region,
e.g., in lβ2,3, can be used to determine if that structural region is important for bioactivity.
Retention of WT levels of activity (RM ≈ 1) indicates changes are tolerated in that region.
Overall loss of activity (RM < 1) distinguishes the region as important for bioactivity.
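The relative mortality calculation and the screening rule described above can be sketched as follows. Only the RM > 1.5 threshold is taken from the text; the tolerance used to decide that an RM value is approximately 1, the category labels and the mortality figures in the usage example are assumptions introduced for illustration.

```python
def relative_mortality(experimental_pct: float, wt_pct: float) -> float:
    """RM = % mortality of the experimental culture / % mortality of the WT culture."""
    if wt_pct <= 0:
        raise ValueError("WT mortality must be positive to compute RM")
    return experimental_pct / wt_pct


def classify_rm(rm: float, improved_threshold: float = 1.5,
                wt_tolerance: float = 0.1) -> str:
    """Classify a 1-dose result.

    improved_threshold comes from the text (RM > 1.5).
    wt_tolerance is a hypothetical cutoff for "RM approximately 1".
    """
    if rm > improved_threshold:
        return "improved"   # candidate for 8-dose assay
    if rm < 1.0 - wt_tolerance:
        return "reduced"    # loss of activity
    return "wt-like"        # change tolerated


if __name__ == "__main__":
    # Hypothetical mortality percentages, for illustration only.
    for experimental, wt in [(60.0, 30.0), (28.0, 30.0), (12.0, 30.0)]:
        rm = relative_mortality(experimental, wt)
        print(f"RM = {rm:.2f} -> {classify_rm(rm)}")
```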


5.29.4 Cry3Bb.lβ2,3: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ2,3 proteins are a collection of proteins altered in the lβ2,3 region of
Cry3Bb (see Section 5.3.4). Typical results of 1-dose assays of these altered proteins are
shown in FIG. 12. The RM values for Cry3Bb.lβ2,3 proteins are less than 1, with a few
exceptions of values close to 1, indicating that this region is important for toxicity.

5.29.5 Cry3Bb.lβ6,7: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ6,7 proteins are a collection of proteins altered in the lβ6,7 region of
Cry3Bb (see Section 5.3.5). Typical results of 1-dose assays of these altered proteins are
shown in FIG. 13. With a few exceptions of values close to 1, the RM values for
Cry3Bb.lβ6,7 proteins are less than 1, indicating that this region is important for toxicity.

5.29.6 Cry3Bb.lβ10,11: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ10,11 proteins are a collection of proteins altered in the lβ10,11 region
of Cry3Bb (see Section 5.3.6). Typical results of 1-dose assays of these altered proteins are
shown in FIG. 14. With a few exceptions of values close to 1, the RM values for
Cry3Bb.lβ10,11 proteins are less than 1, indicating that this region is important for
bioactivity.

5.29.7 Insect Bioassay Results: Results of 8-Dose Assays Against SCRW

Results from 8-dose assays are expressed as an LC50 value (protein concentration
giving 50% mortality) with 95% confidence intervals. The LC50 values with 95% confidence
intervals of altered Cry3Bb proteins showing improved activities against SCRW larvae and
LC50 values of the WT Cry3Bb control determined at the same time are listed in Table 13
along with the fold increase over WT activity for each improved protein.
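The fold increase reported in Table 13 is consistent with the ratio of the WT control LC50 to the LC50 of the improved protein measured at the same time. The following sketch recomputes that ratio for three Table 13 entries; the function name and the dictionary layout are illustrative, and rounding to one decimal place is an assumption about how the tabulated values were prepared.

```python
def fold_increase(wt_lc50: float, improved_lc50: float) -> float:
    """Fold increase over WT activity = WT LC50 / improved-protein LC50."""
    if improved_lc50 <= 0:
        raise ValueError("LC50 must be positive")
    return wt_lc50 / improved_lc50


# (improved-protein LC50, WT control LC50), values as listed in Table 13.
table_13_subset = {
    "Cry3Bb.60": (6.7, 24.1),
    "Cry3Bb.11221": (3.2, 20.5),
    "Cry3Bb.11231": (3.3, 26.1),
}

if __name__ == "__main__":
    for protein, (improved, wt) in table_13_subset.items():
        print(f"{protein}: {fold_increase(wt, improved):.1f}x")
```

A lower LC50 corresponds to higher activity, so a fold increase greater than 1 indicates that less of the altered protein than of WT Cry3Bb is needed to give 50% mortality.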
Table 13

Designed Cry3Bb Proteins Were Tested Against SCRW Larvae in Replicated,
8-Dose Assays to Determine the LC50 Values

                    LC50 ng/well (95% C.I.)
Improved Protein    Improved Protein    WT Cry3Bb Control    Fold Increase Over WT Activity

Cry3Bb.60           6.7 (5.3-8.4)       24.1 (15-39)         3.6x
Cry3Bb.11221        3.2 (2.5-4)         20.5 (14.5-29)       6.4x
Cry3Bb.11222        7.3 (6-9)           29.4 (23-37)         4.0x
Cry3Bb.11223        10.5 (9-12)         29.4 (23-37)         2.8x
Cry3Bb.11224        6.5 (5.1-8.2)       32.5 (25-43)         5.0x
Cry3Bb.11225        13.7 (11-16.8)      49.5 (39-65)         3.6x
Cry3Bb.11226        16.7 (10.6-24.2)    49.5 (39-65)         3.0x
Cry3Bb.11227        11.1 (9.1-13.5)     21.3 (16-28)         1.9x
Cry3Bb.11228        8.0 (6.6-9.8)       32.9 (25-45)         4.1x
Cry3Bb.11229        7.2 (5.8-8.8)       18.2 (15-22)         2.5x
Cry3Bb.11230        7.0 (5.8-8.6)       32.9 (25-45)         4.7x
Cry3Bb.11231        3.3 (3.0-3.7)       26.1 (22-31)         7.9x
Cry3Bb.11232        6.4 (5.4-7.7)       32.9 (25-45)         5.1x
Cry3Bb.11233        15.7 (12-20)        32.9 (25-45)         2.2x
Cry3Bb.11234        7 (6-9)             29 (22-39)           4.1x
Cry3Bb.11235        4.2 (3.6-4.9)       13.3 (10-17)         3.2x
Cry3Bb.11236        11.6 (9-15)         36.4 (27-49)         3.1x
Cry3Bb.11237        6.8 (4-11)          36.4 (27-49)         5.4x
Cry3Bb.11238        13.9 (11-17)        36.4 (27-49)         2.6x
Cry3Bb.11239        13.0 (10-16)        36.4 (27-49)         2.8x
Cry3Bb.11241        11 (7-16)           29 (22-39)           2.6x
Cry3Bb.11242        11.9 (9.2-16)       30 (23-38)           2.5x
Cry3Bb.11032        4.2 (3.6-4.9)       13.3 (10-17)         3.1x
Cry3Bb.11035        10.3 (8-13)         27.9 (23-34)         2.7x
Cry3Bb.11036        6.5 (5.1-7.9)       27.9 (23-34)         4.3x
Cry3Bb.11046        12.1 (8-19)         31.2 (25-39)         2.6x
Cry3Bb.11048        8.3 (6-11)          35.4 (24-53)         4.3x
Cry3Bb.11051        11.8 (8-16)         35.4 (24-53)         3.0x
Cry3Bb.11057        8.8 (7-11)          29.5 (24-36)         3.4x
Cry3Bb.11058        9.6 (6-14)          33.4 (27-43)         3.5x
Cry3Bb.11081        8.5 (7-11)          51.5 (37-79)         6.1x
Cry3Bb.11082        10.6 (8-13)         51.5 (37-79)         4.9x
Cry3Bb.11083        7.0 (5-10)          51.5 (37-79)         7.4x
Cry3Bb.11084        7.2 (4-12)          51.5 (37-79)         7.2x
Cry3Bb.11095        11.1 (9-14)         51.5 (37-79)         4.6x
Cry3Bb.11098









5.29.8 Insect Bioassay Results: 8-Dose Assays Against WCRW

WCRW larvae are delicate and difficult to work with. Therefore, only some of
the designed Cry3Bb proteins showing improved activity against SCRW larvae were also tested
against WCRW larvae in 8-dose assays. The LC50 determinations for the designed Cry3Bb
proteins are shown in Table 14 along with the LC50 values of the WT Cry3Bb control
determined at the same time.
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Table 14 

Cry3Bb* Proteins Showing Improved Activity Against SCRW Larvae Also 
Show Improved Activity Against WCRW Larvae 



LC S0 ng/well (95% C.I.) 


Improved Protein 


Improved Protein 


WT Cry3Bb 


Fold Increase Over 






Control 


WT Activity 


EG11083 


6.3 (4.7-8.2) 


63.5 (46-91) 


lO.lx 


EG11230 


24.2 (13-40) 


4.5 (2.1-7.4) 


5.4x 


EG11231 


32.2 (14-67) 


2.5 (1.7-3.6) 


12.9x 
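
The fold-increase column of Table 14 can be checked with a short calculation. The sketch below assumes, since the text does not state the formula, that the fold increase is the WT control LC50 divided by the LC50 of the improved protein, and that in each row the smaller LC50 belongs to the improved protein. The function name and the dictionary are illustrative only.

```python
def fold_increase(wt_lc50: float, improved_lc50: float) -> float:
    """Ratio of WT LC50 to improved-protein LC50 (assumed definition)."""
    if wt_lc50 <= 0 or improved_lc50 <= 0:
        raise ValueError("LC50 values must be positive")
    return wt_lc50 / improved_lc50


# (WT control LC50, improved-protein LC50) in ng/well, point estimates from Table 14.
# Assumption: the lower value in each row is the improved protein.
table_14 = {
    "EG11083": (63.5, 6.3),
    "EG11230": (24.2, 4.5),
    "EG11231": (32.2, 2.5),
}

for strain, (wt, improved) in table_14.items():
    print(f"{strain}: {fold_increase(wt, improved):.1f}x")
```

Under these assumptions the ratios round to the fold values listed in the table. The confidence intervals are not propagated here.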



5.30 Example 30 - Channel Activity

Ion channels produced by Cry3Bb and some of its derivatives were measured by
the methods described by Slatin et al. (1990). In some instances, lipid bilayers were prepared
from a 4:1 mixture of phosphatidylethanolamine (PE) : phosphatidylcholine (PC). Toxin
protein was solubilized from washed, C2 medium, B. thuringiensis cultures with 12 mM
KOH. Following centrifugation to remove spores and other debris, 10 µg of soluble toxin
protein was added to the cis compartment (4.5 ml volume) of the membrane chamber.
Protein concentration was determined using the BCA Protein Assay (Pierce).

5.30.1 Channel activity of WT Cry3Bb

Upon exposure to black lipid membranes, Cry3Bb forms ion channels with various
conductance states. The channels formed by Cry3Bb are rarely discrete channels with
well resolved open and closed states and usually require incubation of the toxin with the
membrane for 30 - 45 min before any channel-like events are observed. After formation of
the initial conductances, the size increases from approximately 200 pS to over 10,000 pS over
2 - 3 h. Only the small conductances (≤ 200 pS) are voltage dependent. Over 200 pS, the
conductances are completely symmetric. Cry3Bb channels also exhibit β-mercaptoethanol-
dependent activation, growing from small channel conductances of ~200 pS to several
thousand pS within 2 min of the addition of β-mercaptoethanol to the cis compartment of the
membrane chamber.

5.30.2 Cry3Bb.11032

The channel activity of Cry3Bb.11032 is much like WT Cry3Bb when the solubilized
toxin protein is added to the cis compartment of the membrane chamber. However,
when this protein is artificially incorporated into the membrane by forming or "painting" the
membrane in the presence of the Cry3Bb.11032 protein, a 16-fold increase in the initial
channel conductances is observed (~4000 pS). This phenomenon is not observed with WT
Cry3Bb.

5.30.3 Cry3Bb.11035

Upon exposure to artificial membranes, the Cry3Bb.11035 protein spontaneously
forms channels that grow to large conductances within a relatively short time span (~5 min).
Conductance values range from 3000 - 6000 pS and, like WT Cry3Bb, are voltage dependent
at low conductance values.

5.30.4 Cry3Bb.11048

The Cry3Bb.11048 protein is quite different from WT Cry3Bb in that it appears
not to form channels at all, but, rather, forms symmetrical pores with respect to voltage.
Once the pore is formed, it remains open and allows a steady conductance ranging from 25 to
130 pS.

5.30.5 Cry3Bb.11224 and Cry3Bb.11226

The metal binding site of WT Cry3Bb formed by H231 in the dimer structure was
removed in proteins Cry3Bb.11224 and Cry3Bb.11226. The conductances formed by both
designed proteins are identical to that of WT Cry3Bb with the exception that neither of the
designed proteins exhibits β-mercaptoethanol-dependent activation.

5.30.6 Cry3Bb.11221

Cry3Bb.11221 protein has been observed to immediately form small channels of
100 - 200 pS with limited voltage dependence. Some higher conductances were observed at
the negative potential. In other studies, the onset of activity was delayed by 27 min, which is
more typical for WT Cry3Bb. Unlike WT Cry3Bb, however, Cry3Bb.11221 forms well
resolved, 600 pS channels with long open states. The protein eventually reaches conductances
of 7000 pS.

5.30.7 Cry3Bb.11242

Cry3Bb.11242 protein forms small conductances immediately upon exposure to
an artificial membrane. The conductances grow steadily and rapidly to 6000 pS in
approximately 3 min. Some voltage dependence was noted with a preference for a negative
imposed voltage.

5.30.8 Cry3Bb.11230

Unlike WT Cry3Bb, Cry3Bb.11230 forms well resolved channels with long open
states that do not continue to grow in conductance with time. The maximum observed
channel conductances reached 3000 pS. FIG. 15 illustrates the difference between the channels
formed by Cry3Bb and Cry3Bb.11230.

5.30.9 Cry3Bb.60

Cry3Bb.60 forms well resolved ion channels within 20 min of exposure to an
artificial membrane. These channels grow in conductance and frequency with time. The
behavior of Cry3Bb.60 in a planar lipid bilayer differs from Cry3Bb in two significant ways.
The conductances created by Cry3Bb.60 form more quickly than those of Cry3Bb and, unlike
those of Cry3Bb, the conductances are stable, having well resolved open and closed states
definitive of stable ion channels (FIG. 16).

5.31 Example 31 - Primer Compositions

Table 15

SEQ ID NO:83        % of Nucleotide in mixture
Code      A      T      G      C
N         25     25     25     25

Table 16

SEQ ID NO:84        % of Nucleotide in mixture
Code      A      T      G      C
N         25     25     25     25

Table 17

SEQ ID NO:85        % of Nucleotide in mixture
Code      A      T      G      C
B         16     16     52     16
D         70     10     10     10
N         25     25     25     25

Table 18

SEQ ID NO:86        % of Nucleotide in mixture
Code      A      T      G      C
E         82     6      6      6
F         6      6      6      82
J         6      82     6      6
I         6      6      82     6
N         25     25     25     25
Table 19

SEQ ID NO:88        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
E         82     6      6      6
H         1      1      1      97
I         6      6      82     6
K         15     15     15     55
R         15     55     15     15

Table 20

SEQ ID NO:90        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
F         6      6      6      82
N         25     25     25     25
E         82     6      6      6
I         6      6      82     6
L         8      1      83     8
O         1      1      1      97
Table 21

SEQ ID NO:91        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
E         82     6      6      6
H         1      1      1      97
I         6      6      82     6
N         25     25     25     25
M         82     2      8      8

Table 22

SEQ ID NO:92        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
Q         0      9      82     9
F         6      6      6      82
N         25     25     25     25
E         82     6      6      6
I         6      6      82     6
Table 23

SEQ ID NO:92        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
F         6      6      6      82
N         25     25     25     25
E         82     6      6      6
I         6      6      82     6

Table 24

SEQ ID NO:95        % of Nucleotide in mixture
Code      A      T      G      C
J         6      82     6      6
N         25     25     25     25
E         82     6      6      6
I         6      6      82     6
M         82     2      8      8
P         8      2      8      82
S         1      97     1      1
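
The primer composition tables can be read as per-position nucleotide mixtures. The following is a minimal sketch, not the synthesis procedure itself, of how one oligonucleotide might be drawn from such a doped mixture. The code percentages are taken from Table 18; the template string "GCEFJIAT", the function names, and the treatment of each position as an independent draw are assumptions made only for illustration.

```python
import random

# Percent of A, T, G and C supplied at a position for each code (values of Table 18).
DOPING_CODES = {
    "E": {"A": 82, "T": 6, "G": 6, "C": 6},
    "F": {"A": 6, "T": 6, "G": 6, "C": 82},
    "J": {"A": 6, "T": 82, "G": 6, "C": 6},
    "I": {"A": 6, "T": 6, "G": 82, "C": 6},
    "N": {"A": 25, "T": 25, "G": 25, "C": 25},
}


def sample_oligo(template: str, rng: random.Random) -> str:
    """Draw one sequence; code letters are sampled, plain bases are copied."""
    bases = []
    for symbol in template:
        if symbol in DOPING_CODES:
            mix = DOPING_CODES[symbol]
            bases.append(rng.choices(list(mix), weights=list(mix.values()))[0])
        else:
            bases.append(symbol)
    return "".join(bases)


def fraction_unmutated(template: str) -> float:
    """Probability that every coded position receives its most abundant base."""
    p = 1.0
    for symbol in template:
        if symbol in DOPING_CODES:
            p *= max(DOPING_CODES[symbol].values()) / 100
    return p


# Hypothetical template: fixed G, C, then four doped positions, then fixed A, T.
template = "GCEFJIAT"
print(sample_oligo(template, random.Random(0)))
print(f"{fraction_unmutated(template):.3f}")
```

With four positions doped at 82%, roughly 45% of the oligonucleotides in this hypothetical mixture would carry the predominant base at all four positions, and the remainder would carry at least one substitution.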



5.32 Example 32 - Atomic Coordinates for Cry3Bb 

The atomic coordinates of the Cry3Bb protein are given in the Appendix included
in Section 9.1.

5.33 Example 33 - Atomic Coordinates for Cry3A

The atomic coordinates of the Cry3A protein are given in the Appendix included
in Section 9.2.
5.34 Example 34 - Modification of cry Genes for Expression in Plants

Wild-type cry genes are known to be expressed poorly in plants as a full length
gene or as a truncated gene. Typically, the G+C content of a cry gene is low (37%) and the
gene often contains many A+T rich regions, potential polyadenylation sites and numerous
ATTTA sequences. Table 25 shows a list of potential polyadenylation sequences which should
be avoided when preparing the "plantized" gene construct.






Table 25

List Of Sequences Of The Potential Polyadenylation Signals

AATAAA*      AAGCAT       AATAAT*      ATTAAT
AACCAA       ATACAT       ATATAA       AAAATA
AATCAA       ATTAAA**     ATACTA       AATTAA**
ATAAAA       AATACA**     ATGAAA       CATAAA**

* indicates a potential major plant polyadenylation site.
** indicates a potential minor animal polyadenylation site.
All others are potential minor plant polyadenylation sites.

The regions for mutagenesis may be selected in the following manner. All regions
of the DNA sequence of the cry gene are identified which contain five or more
consecutive base pairs which are A or T. These are ranked in terms of length and highest
percentage of A+T in the surrounding sequence over a 20-30 base pair region. The DNA is
analysed for regions which might contain polyadenylation sites or ATTTA sequences.
Oligonucleotides are then designed which maximize the elimination of A+T consecutive regions
which contain one or more polyadenylation sites or ATTTA sequences. Two potential
plant polyadenylation sites have been shown to be more critical based on published reports.
Codons are selected which increase G+C content, but do not generate restriction sites for
enzymes useful for cloning and assembly of the modified gene (e.g., BamHI, BglII, SacI, NcoI,
EcoRV, etc.). Likewise, codons are avoided which contain the doublets TA or GC, which
have been reported to be infrequently found in plant codons.
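
The first screening steps of this selection procedure (locating runs of five or more consecutive A or T bases, potential polyadenylation signals, and ATTTA sequences) lend themselves to a simple scan. The sketch below is illustrative only: the function names and the 19-base test sequence are hypothetical and are not taken from any cry gene, and the ranking by surrounding A+T percentage and the codon replacement step are not implemented.

```python
import re

# Potential polyadenylation signals as listed in Table 25.
POLYA_SIGNALS = [
    "AATAAA", "AAGCAT", "AATAAT", "ATTAAT", "AACCAA", "ATACAT", "ATATAA", "AAAATA",
    "AATCAA", "ATTAAA", "ATACTA", "AATTAA", "ATAAAA", "AATACA", "ATGAAA", "CATAAA",
]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def at_runs(seq: str, min_len: int = 5) -> list:
    """(start, run) pairs for stretches of min_len or more consecutive A/T bases."""
    pattern = re.compile(f"[AT]{{{min_len},}}")
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


def motif_hits(seq: str, motifs: list) -> dict:
    """0-based start positions of every motif found, including overlapping hits."""
    hits = {}
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.setdefault(motif, []).append(start)
            start = seq.find(motif, start + 1)
    return hits


# Hypothetical sequence used only to exercise the functions.
example = "GGCAATAAAGCCATTTAGG"
print(at_runs(example))                    # A/T runs of five or more bases
print(motif_hits(example, POLYA_SIGNALS))  # potential polyadenylation signals
print(motif_hits(example, ["ATTTA"]))      # ATTTA sequences
print(round(gc_content(example), 2))       # G+C fraction
```

Regions flagged by more than one of these scans would be the natural first candidates for mutagenic oligonucleotides, subject to the restriction-site and codon constraints described above.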

Although the CaMV35S promoter is generally a high level constitutive promoter
in most plant tissues, the expression level of genes driven by the CaMV35S promoter is low in
floral tissue relative to the levels seen in leaf tissue. Because the economically important
targets damaged by some insects are the floral parts or derived from floral parts (e.g., cotton
squares and bolls, tobacco buds, tomato buds and fruit), it is often advantageous to increase
the expression of crystal proteins in these tissues over that obtained with the CaMV35S
promoter.

The 35S promoter of Figwort Mosaic Virus (FMV) is analogous to the CaMV35S
promoter. This promoter has been isolated and engineered into a plant transformation vector.
Relative to the CaMV promoter, the FMV 35S promoter is highly expressed in the floral
tissue, while still providing similar high levels of gene expression in other tissues such as leaf.
A plant transformation vector may be constructed in which the full length synthetic cry gene
is driven by the FMV 35S promoter. Tobacco plants may be transformed with the vector and
compared for expression of the crystal protein by Western blot or ELISA immunoassay in
leaf and floral tissue. The FMV promoter has been used to produce relatively high levels of
crystal protein in floral tissue compared to the CaMV promoter.

5.35 Example 35 - Expression of Synthetic cry Genes with ssRUBISCO
Promoters and Chloroplast Transit Peptides

The genes in plants encoding the small subunit of RUBISCO (SSU) are often highly
expressed, light regulated and sometimes show tissue specificity. These expression
properties are largely due to the promoter sequences of these genes. It has been possible to use SSU
promoters to express heterologous genes in transformed plants. Typically a plant will
contain multiple SSU genes, and the expression levels and tissue specificity of different SSU
genes will be different. The SSU proteins are encoded in the nucleus and synthesized in the
cytoplasm as precursors that contain an N-terminal extension known as the chloroplast transit
peptide (CTP). The CTP directs the precursor to the chloroplast and promotes the uptake of
the SSU protein into the chloroplast. In this process, the CTP is cleaved from the SSU
protein. These CTP sequences have been used to direct heterologous proteins into chloroplasts
of transformed plants.

The SSU promoters might have several advantages for expression of heterologous
genes in plants. Some SSU promoters are very highly expressed and could give rise to
expression levels as high or higher than those observed with the CaMV35S promoter. The
tissue distribution of expression from SSU promoters is different from that of the CaMV35S
promoter, so for control of some insect pests, it may be advantageous to direct the expression
of crystal proteins to those cells in which SSU is most highly expressed. For example,
although relatively constitutive, in the leaf the CaMV35S promoter is more highly expressed in
vascular tissue than in some other parts of the leaf, while most SSU promoters are most
highly expressed in the mesophyll cells of the leaf. Some SSU promoters also are more
highly tissue specific, so it could be possible to utilize a specific SSU promoter to express the
protein of the present invention in only a subset of plant tissues, if for example expression of
such a protein in certain cells was found to be deleterious to those cells. For example, for
control of Colorado potato beetle in potato, it may be advantageous to use SSU promoters to
direct crystal protein expression to the leaves but not to the edible tubers.

Utilizing SSU CTP sequences to localize crystal proteins to the chloroplast might
also be advantageous. Localization of the B. thuringiensis crystal proteins to the chloroplast
could protect these from proteases found in the cytoplasm. This could stabilize the proteins
and lead to higher levels of accumulation of active toxin. cry genes containing the CTP
could be used in combination with the SSU promoter or with other promoters such as
CaMV35S.

5.36 Example 36 - Targeting of Cry* Proteins to the Extracellular
Space or Vacuole through the Use of Signal Peptides

The B. thuringiensis proteins produced from the synthetic genes described here
are localized to the cytoplasm of the plant cell, and this cytoplasmic localization results in
plants that are insecticidally effective. It may be advantageous for some purposes to direct
the B. thuringiensis proteins to other compartments of the plant cell. Localizing
B. thuringiensis proteins in compartments other than the cytoplasm may result in less
exposure of the B. thuringiensis proteins to cytoplasmic proteases, leading to greater accumulation
of the protein and yielding enhanced insecticidal activity. Extracellular localization could lead to
more efficient exposure of certain insects to the B. thuringiensis proteins, leading to greater
efficacy. If a B. thuringiensis protein were found to be deleterious to plant cell function, then
localization to a noncytoplasmic compartment could protect these cells from the protein.

In plants as well as other eukaryotes, proteins that are destined to be localized either
extracellularly or in several specific compartments are typically synthesized with an
N-terminal amino acid extension known as the signal peptide. This signal peptide directs the
protein to enter the compartmentalization pathway, and it is typically cleaved from the
mature protein as an early step in compartmentalization. For an extracellular protein, the
secretory pathway typically involves cotranslational insertion into the endoplasmic reticulum with
cleavage of the signal peptide occurring at this stage. The mature protein then passes through
the Golgi body into vesicles that fuse with the plasma membrane, thus releasing the protein
into the extracellular space. Proteins destined for other compartments follow a similar
pathway. For example, proteins that are destined for the endoplasmic reticulum or the Golgi
body follow this scheme, but they are specifically retained in the appropriate compartment.
In plants, some proteins are also targeted to the vacuole, another membrane bound
compartment in the cytoplasm of many plant cells. Vacuole targeted proteins diverge from the above
pathway at the Golgi body where they enter vesicles that fuse with the vacuole.

A common feature of this protein targeting is the signal peptide that initiates the
compartmentalization process. Fusing a signal peptide to a protein will in many cases lead to the
targeting of that protein to the endoplasmic reticulum. The efficiency of this step may
depend on the sequence of the mature protein itself as well. The signals that direct a protein to
a specific compartment rather than to the extracellular space are not as clearly defined. It
appears that many of the signals that direct the protein to specific compartments are contained
within the amino acid sequence of the mature protein. This has been shown for some vacuole
targeted proteins, but it is not yet possible to define these sequences precisely. It appears that
secretion into the extracellular space is the "default" pathway for a protein that contains a
signal sequence but no other compartmentalization signals. Thus, a strategy to direct
B. thuringiensis proteins out of the cytoplasm is to fuse the synthetic
B. thuringiensis genes to DNA sequences encoding known plant signal peptides. These
fusion genes will give rise to B. thuringiensis proteins that enter the secretory pathway, and
lead to extracellular secretion or targeting to the vacuole or other compartments.

Signal sequences for several plant genes have been described. One such sequence, for the
tobacco pathogenesis related protein PR1b, has been previously described (Cornelissen et al.,
1986). The PR1b protein is normally localized to the extracellular space. Another type of
signal peptide is contained on seed storage proteins of legumes. These proteins are localized
to the protein body of seeds, which is a vacuole like compartment found in seeds. A signal
peptide DNA sequence for the β-subunit of the 7S storage protein of common bean
(Phaseolus vulgaris), PvuB, has been described (Doyle et al., 1986). Based on
these published sequences, genes may be synthesized chemically using oligonucleotides that
encode the signal peptides for PR1b and PvuB. In some cases, to achieve secretion or
compartmentalization of heterologous proteins, it may be necessary to include some amino acid
sequence beyond the normal cleavage site of the signal peptide. This may be necessary to
ensure proper cleavage of the signal peptide.

5.37 Example 37 - Isolation of Transgenic Maize Resistant
to Diabrotica spp. Using cry3Bb Variants

5.37.1 Plant Gene Construction

The expression of a plant gene which exists in double-stranded DNA form
involves transcription of messenger RNA (mRNA) from one strand of the DNA by RNA
polymerase enzyme, and the subsequent processing of the mRNA primary transcript inside the
nucleus. This processing involves a 3' non-translated region which adds polyadenylate
nucleotides to the 3' end of the RNA. Transcription of DNA into mRNA is regulated by a
region of DNA usually referred to as the "promoter". The promoter region contains a sequence
of bases that signals RNA polymerase to associate with the DNA and to initiate the
transcription of mRNA using one of the DNA strands as a template to make a corresponding
strand of RNA.

A number of promoters which are active in plant cells have been described in the
literature. Such promoters may be obtained from plants or plant viruses and include, but are
not limited to, the nopaline synthase (NOS) and octopine synthase (OCS) promoters (which
are carried on tumor-inducing plasmids of Agrobacterium tumefaciens), the cauliflower
mosaic virus (CaMV) 19S and 35S promoters, the light-inducible promoter from the small
subunit of ribulose 1,5-bisphosphate carboxylase (ssRUBISCO, a very abundant plant
polypeptide), and the Figwort Mosaic Virus (FMV) 35S promoter. All of these promoters
have been used to create various types of DNA constructs which have been expressed in
plants (see e.g., U. S. Patent No. 5,463,175, specifically incorporated herein by reference).

The particular promoter selected should be capable of causing sufficient expression
of the enzyme coding sequence to result in the production of an effective amount of
protein. One set of preferred promoters are constitutive promoters such as the CaMV35S or
FMV35S promoters that yield high levels of expression in most plant organs (U. S. Patent
No. 5,378,619, specifically incorporated herein by reference). Another set of preferred
promoters are root enhanced or specific promoters such as the CaMV derived 4 as-1 promoter or
the wheat POX1 promoter (U. S. Patent No. 5,023,179, specifically incorporated herein by
reference; Hertig et al., 1991). The root enhanced or specific promoters would be
particularly preferred for the control of corn rootworm (Diabrotica spp.) in transgenic corn plants.

The promoters used in the DNA constructs (i.e. chimeric plant genes) of the
present invention may be modified, if desired, to affect their control characteristics. For example,
the CaMV35S promoter may be ligated to the portion of the ssRUBISCO gene that represses
the expression of ssRUBISCO in the absence of light, to create a promoter which is active in
leaves but not in roots. The resulting chimeric promoter may be used as described herein.
For purposes of this description, the phrase "CaMV35S" promoter thus includes variations of
CaMV35S promoter, e.g., promoters derived by means of ligation with operator regions,
random or controlled mutagenesis, etc. Furthermore, the promoters may be altered to contain
multiple "enhancer sequences" to assist in elevating gene expression.

The RNA produced by a DNA construct of the present invention also contains a 5'
non-translated leader sequence. This sequence can be derived from the promoter selected to
express the gene, and can be specifically modified so as to increase translation of the mRNA.
The 5' non-translated regions can also be obtained from viral RNAs, from suitable eucaryotic
genes, or from a synthetic gene sequence. The present invention is not limited to constructs
wherein the non-translated region is derived from the 5' non-translated sequence that
accompanies the promoter sequence.

For optimized expression in monocotyledonous plants such as maize, an intron
should also be included in the DNA expression construct. This intron would typically be
placed near the 5' end of the mRNA in untranslated sequence. This intron could be obtained
from, but is not limited to, a set of introns consisting of the maize hsp70 intron (U. S. Patent
No. 5,424,412; specifically incorporated herein by reference) or the rice Act1 intron
(McElroy et al., 1990). As shown below, the maize hsp70 intron is useful in the present
invention.

As noted above, the 3' non-translated region of the chimeric plant genes of the
present invention contains a polyadenylation signal which functions in plants to cause the
addition of adenylate nucleotides to the 3' end of the RNA. Examples of preferred 3' regions
are (1) the 3' transcribed, non-translated regions containing the polyadenylate signal of
Agrobacterium tumor-inducing (Ti) plasmid genes, such as the nopaline synthase (NOS) gene and
(2) plant genes such as the pea ssRUBISCO E9 gene (Fischhoff et al., 1987).

5.37.2 Plant Transformation and Expression

A chimeric plant gene containing a structural coding sequence of the present
invention can be inserted into the genome of a plant by any suitable method. Suitable plant
transformation vectors include those derived from a Ti plasmid of Agrobacterium
tumefaciens, as well as those disclosed, e.g., by Herrera-Estrella (1983), Bevan (1983), Klee (1985)
and Eur. Pat. Appl. Publ. No. EP0120516. In addition to plant transformation vectors
derived from the Ti or root-inducing (Ri) plasmids of Agrobacterium, alternative methods can
be used to insert the DNA constructs of this invention into plant cells. Such methods may
involve, for example, the use of liposomes, electroporation, chemicals that increase free DNA
uptake, free DNA delivery via microprojectile bombardment, and transformation using
viruses or pollen (Fromm et al., 1986; Armstrong et al., 1990; Fromm et al., 1990).

5.37.3 Construction of Monocot Plant Expression Vectors
for cry3Bb Variants

5.37.3.1 Design of cry3Bb Variant Genes for Plant Expression

For efficient expression of the cry3Bb variants in transgenic plants, the gene
encoding the variants must have a suitable sequence composition (Diehn et al., 1996). One
example of such a sequence is shown for the v11231 gene (SEQ ID NO:99) which encodes the
Cry3Bb.11231 variant protein (SEQ ID NO:100) with Diabrotica activity. This gene was
derived via mutagenesis (Kunkel, 1985) of a cry3Bb synthetic gene (SEQ ID NO:101)
encoding a protein essentially homologous to the protein encoded by the native cry3Bb gene
(GenBank Accession Number M89794, SEQ ID NO:102). The following oligonucleotides were
used in the mutagenesis of the original cry3Bb synthetic gene (SEQ ID NO:101) to create the
v11231 gene (SEQ ID NO:99):

Oligo #1:
5'-TAGGCCTCCATCCATGGCAAACCCTAACAATC-3' (SEQ ID NO:103)

Oligo #2:
5'-TCCCATCTTCCTACTTACGACCCTGCAGAAATACGGTCCAAC-3' (SEQ ID NO:104)

Oligo #3:
5'-GACCTCACCTACCAAACATTCGATCTTG-3' (SEQ ID NO:105)

Oligo #4:
5'-CGAGTTCTACCGTAGGCAGCTCAAG-3' (SEQ ID NO:106)
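
As an illustration of how the mutagenic oligonucleotides relate to the restriction enzymes used in the cloning steps of this example, the sketch below scans the four oligonucleotide sequences for the standard recognition sequences of NcoI, PstI, EcoRI and NotI. The function and dictionary names are illustrative. The text does not state which sites, if any, the oligonucleotides were designed to introduce, so the scan only reports what is present in the sequences as written.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
}

# Oligonucleotide sequences as given for SEQ ID NO:103-106.
OLIGOS = {
    "Oligo #1": "TAGGCCTCCATCCATGGCAAACCCTAACAATC",
    "Oligo #2": "TCCCATCTTCCTACTTACGACCCTGCAGAAATACGGTCCAAC",
    "Oligo #3": "GACCTCACCTACCAAACATTCGATCTTG",
    "Oligo #4": "CGAGTTCTACCGTAGGCAGCTCAAG",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to 0-based start positions of its site in seq."""
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        start = seq.find(site)
        while start != -1:
            found.setdefault(enzyme, []).append(start)
            start = seq.find(site, start + 1)
    return found


for name, seq in OLIGOS.items():
    print(name, len(seq), find_sites(seq))
```

In this scan Oligo #1 contains an NcoI recognition sequence and Oligo #2 a PstI recognition sequence, while Oligos #3 and #4 contain none of the four sites.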

5.37.3.2 Construction of Cry3Bb Monocot Plant Expression Vector

To place the cry3Bb variant gene v11231 in a vector suitable for expression in
monocotyledonous plants (i.e. under control of the enhanced Cauliflower Mosaic Virus 35S
promoter and linked to the hsp70 intron followed by a nopaline synthase polyadenylation site as
in U. S. Patent No. 5,424,412, specifically incorporated herein by reference), the vector
pMON19469 was digested with NcoI and EcoRI. The larger vector band of approximately
4.6 kb was electrophoresed, purified, and ligated with T4 DNA ligase to the NcoI-EcoRI
fragment of approximately 2 kb containing the v11231 gene (SEQ ID NO:99). The ligation
mix was transformed into E. coli, carbenicillin resistant colonies were recovered, and plasmid
DNA was recovered by DNA miniprep procedures. This DNA was subjected to restriction
endonuclease analysis with enzymes such as NcoI and EcoRI (together), NotI, and PstI to identify
clones containing pMON33708 (the v11231 coding sequence fused to the hsp70 intron under
control of the enhanced CaMV35S promoter).

To place the v11231 gene in a vector suitable for recovery of stably transformed
and insect resistant plants, the 3.75-kb NotI restriction fragment from pMON33708
containing the lysine oxidase coding sequence fused to the hsp70 intron under control of the
enhanced CaMV35S promoter was isolated by gel electrophoresis and purification. This
fragment was ligated with pMON30460 treated with NotI and calf intestinal alkaline phosphatase
(pMON30460 contains the neomycin phosphotransferase coding sequence under control of
the CaMV35S promoter). Kanamycin resistant colonies were obtained by transformation of
this ligation mix into E. coli, and colonies containing pMON33710 were identified by restriction
endonuclease digestion of plasmid miniprep DNAs. Restriction enzymes such as NotI,
EcoRV, HindIII, NcoI, EcoRI, and BglII can be used to identify the appropriate clones
containing the NotI fragment of pMON33708 in the NotI site of pMON30460 (i.e. pMON33710)
in the orientation such that both genes are in tandem (i.e. the 3' end of the v11231 expression
cassette is linked to the 5' end of the nptII expression cassette). Expression of the v11231
protein by pMON33710 in corn protoplasts was confirmed by electroporation of
pMON33710 DNA into protoplasts followed by protein blot and ELISA analysis. This
vector can be introduced into the genomic DNA of corn embryos by particle gun bombardment
followed by paromomycin selection to obtain corn plants expressing the v11231 gene
essentially as described in U. S. Patent No. 5,424,412, specifically incorporated herein by
reference.

In this example, the vector was introduced via cobombardment with a
hygromycin resistance conferring plasmid into immature embryo scutella (IES) of maize, followed
by hygromycin selection, and regeneration. Transgenic corn lines expressing the v11231
protein were identified by ELISA analysis. Progeny seed from these events were
subsequently tested for protection from Diabrotica feeding.

5.37.3.3 In planta Performance of Cry3Bb.11231

Transformed corn plants expressing Cry3Bb.11231 protein were challenged with
western corn rootworm (WCR) larvae in both a seedling and 10 inch pot assay. The
transformed genotype was A634, where the progeny of the R0 cross by A634 was evaluated.
Observations included effect on larval development (weight), root damage rating (RDR), and
protein expression. The transformation vector containing the cry3Bb gene was pMON33710.
Treatments included the positive and negative iso-populations for each event and an A634
check.

The seedling assay consisted of the following steps: (i) single seeds were placed
in 1 oz cups containing potting soil; (ii) at spiking, each seedling was infested with 4 neonate
larvae; and (iii) after infestation, seedlings were incubated for 7 days at 25°C, 50% RH, and
14:10 (L:D) photo period. Adequate moisture was added to the potting soil during the
incubation period to maintain seedling vigor.

The 10 inch pot assay consisted of the following steps: (i) single seeds were
placed in 10 inch pots containing potting soil; (ii) at 14 days post planting, each pot was
infested with 800 eggs which had been pre-incubated such that hatch would occur 5-7 days
post infestation; and (iii) after infestation, plants were incubated for 4 weeks under the same
environmental conditions as the seedling assay. Pots were both sub and top irrigated daily.

For the seedling assay, on day 7 plants were given a root damage rating, and
surviving larvae were weighed. Also at this time, Cry3Bb protein concentrations in the roots
were determined by ELISA. The scale used for the seedling assay to assess root damage is as
follows: RDR (root damage rating) 0 = no visible feeding; RDR 1 = very light feeding; RDR
2 = light feeding; RDR 3 = moderate feeding; RDR 4 = heavy feeding; and RDR 5 = very
heavy feeding.

Results of the seedling assay are shown in Table 26. Plants expressing Cry3Bb
protein were completely protected from WCR feeding, and surviving larvae within this
treatment had not grown. Mean larval weights ranged from 2.03-2.73 mg for the nonexpressing
treatments, whereas the surviving larval average weight was 0.11 mg on the Cry3Bb
expressing treatment. Root damage ratings were 3.86 and 0.33 for the nonexpressing and
expressing isopopulations, respectively. Larval survival ranged from 75-85% for the negative
and check treatments, whereas only 25% of the larvae survived on the Cry3Bb treatment.

Table 26

Effect of Cry3Bb Expressing Plants on
WCR Larvae in a Seedling Assay

                  Plants                          Larvae
                      Root                             %       Mean±SD
Event  Treatment  N   (ppm)   RDR±SD       N    Surv    Wt. (mg)

16     Negative   7   0.0     3.86±0.65    21   75      2.73±1.67
16     Positive   3   29.01   0.33±0.45    3    25      0.11±0.07
A634   Check      4   0.0                  13   81      2.03±0.83
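
The survival percentages in Table 26 can be cross-checked from the reported counts. The sketch below assumes (as an interpretation, not something the text states explicitly) that the larval N column counts surviving larvae and that every seedling received 4 neonates as in step (ii) of the seedling assay. The names `table_26_counts` and `percent_survival` are hypothetical.

```python
LARVAE_PER_SEEDLING = 4  # from step (ii) of the seedling assay

# (event, treatment): (plants infested, surviving larvae), as reported in Table 26
table_26_counts = {
    ("16", "Negative"): (7, 21),
    ("16", "Positive"): (3, 3),
    ("A634", "Check"): (4, 13),
}


def percent_survival(plants: int, survivors: int) -> float:
    """Percent of infested larvae recovered alive (assumes 4 larvae per plant)."""
    infested = plants * LARVAE_PER_SEEDLING
    return 100.0 * survivors / infested


# Whole-number percentages, for comparison with the % Surv column.
survival = {
    key: round(percent_survival(*counts))
    for key, counts in table_26_counts.items()
}
```

Under these assumptions the computed values agree with the % Surv column of the table (75, 25, and 81).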



For the 10 inch pot assay, at 4 weeks post infestation plant height was recorded
and a root damage rating (Iowa 1-6 scale; Hills and Peters, 1971) was given.

Results of the 10 inch pot assay are shown in Table 27. Plants expressing Cry3Bb
protein had significantly less feeding damage and were taller than the non-expressing plants.
Event 16, the higher-expressing of the two events, provided nearly complete control. The
negative treatments had very high root damage ratings, indicating very high insect pressure.
The positive mean root damage ratings were 3.4 and 2.2 for events 6 and 16, respectively.
Mean RDRs for the negative treatments were 5.0 and 5.6.

Table 27

Effect of Cry3Bb Expressing Corn in Controlling
WCR Larval Feeding in a 10 Inch Pot Assay

                      Root                 Plant
Event  Treatment  N   (ppm)   RDR±SD       Height (cm)

6      Negative   7   0.0     5.0±1.41     49.7±18.72
6      Positive   5   7.0     3.4±1.14     73.9±8.67
16     Negative   5   0.0     5.6±0.89     61.2±7.75
16     Positive   5   55.0    2.2±0.84     83.8±7.15

In summary, corn plants expressing Cry3Bb protein have a significant biological
effect on WCR larval development as seen in the seedling assay. When challenged with very
high infestation levels, plants expressing the Cry3Bb protein were protected from WCR
larval feeding damage as illustrated in the 10 inch pot assay.

6.0 Brief Description of the Sequence Identifiers 

SEQ ID NO:1 DNA sequence of cry3Bb.11221 gene.

SEQ ID NO:2 Amino acid sequence of Cry3Bb.11221 polypeptide.

SEQ ID NO:3 DNA sequence of cry3Bb.11222 gene.

SEQ ID NO:4 Amino acid sequence of Cry3Bb.11222 polypeptide.

SEQ ID NO:5 DNA sequence of cry3Bb.11223 gene.

SEQ ID NO:6 Amino acid sequence of Cry3Bb.11223 polypeptide.

SEQ ID NO:7 DNA sequence of cry3Bb.11224 gene.

SEQ ID NO:8 Amino acid sequence of Cry3Bb.11224 polypeptide.

SEQ ID NO:9 DNA sequence of cry3Bb.11225 gene.

SEQ ID NO:10 Amino acid sequence of Cry3Bb.11225 polypeptide.

SEQ ID NO:11 DNA sequence of cry3Bb.11226 gene.

SEQ ID NO:12 Amino acid sequence of Cry3Bb.11226 polypeptide.

SEQ ID NO:13 DNA sequence of cry3Bb.11227 gene.

SEQ ID NO:14 Amino acid sequence of Cry3Bb.11227 polypeptide.

SEQ ID NO:15 DNA sequence of cry3Bb.11228 gene.

SEQ ID NO:16 Amino acid sequence of Cry3Bb.11228 polypeptide.

SEQ ID NO:17 DNA sequence of cry3Bb.11229 gene.

SEQ ID NO:18 Amino acid sequence of Cry3Bb.11229 polypeptide.

SEQ ID NO:19 DNA sequence of cry3Bb.11230 gene.

SEQ ID NO:20 Amino acid sequence of Cry3Bb.11230 polypeptide.

SEQ ID NO:21 DNA sequence of cry3Bb.11231 gene.

SEQ ID NO:22 Amino acid sequence of Cry3Bb.11231 polypeptide.

SEQ ID NO:23 DNA sequence of cry3Bb.11232 gene.

SEQ ID NO:24 Amino acid sequence of Cry3Bb.11232 polypeptide.

SEQ ID NO:25 DNA sequence of cry3Bb.11233 gene.

SEQ ID NO:26 Amino acid sequence of Cry3Bb.11233 polypeptide.

SEQ ID NO:27 DNA sequence of cry3Bb.11234 gene.

SEQ ID NO:28 Amino acid sequence of Cry3Bb.11234 polypeptide.

SEQ ID NO:29 DNA sequence of cry3Bb.11235 gene.

SEQ ID NO:30 Amino acid sequence of Cry3Bb.11235 polypeptide.

SEQ ID NO:31 DNA sequence of cry3Bb.11236 gene.

SEQ ID NO:32 Amino acid sequence of Cry3Bb.11236 polypeptide.

SEQ ID NO:33 DNA sequence of cry3Bb.11237 gene.

SEQ ID NO:34 Amino acid sequence of Cry3Bb.11237 polypeptide.

SEQ ID NO:35 DNA sequence of cry3Bb.11238 gene.

SEQ ID NO:36 Amino acid sequence of Cry3Bb.11238 polypeptide.

SEQ ID NO:37 DNA sequence of cry3Bb.11239 gene.

SEQ ID NO:38 Amino acid sequence of Cry3Bb.11239 polypeptide.

SEQ ID NO:39 DNA sequence of cry3Bb.11241 gene.

SEQ ID NO:40 Amino acid sequence of Cry3Bb.11241 polypeptide.

SEQ ID NO:41 DNA sequence of cry3Bb.11242 gene.

SEQ ID NO:42 Amino acid sequence of Cry3Bb.11242 polypeptide.

SEQ ID NO:43 DNA sequence of cry3Bb.11032 gene.

SEQ ID NO:44 Amino acid sequence of Cry3Bb.11032 polypeptide.

SEQ ID NO:45 DNA sequence of cry3Bb.11035 gene.

SEQ ID NO:46 Amino acid sequence of Cry3Bb.11035 polypeptide.

SEQ ID NO:47 DNA sequence of cry3Bb.11036 gene.

SEQ ID NO:48 Amino acid sequence of Cry3Bb.11036 polypeptide.

SEQ ID NO:49 DNA sequence of cry3Bb.11046 gene.

SEQ ID NO:50 Amino acid sequence of Cry3Bb.11046 polypeptide.

SEQ ID NO:51 DNA sequence of cry3Bb.11048 gene.

SEQ ID NO:52 Amino acid sequence of Cry3Bb.11048 polypeptide.

SEQ ID NO:53 DNA sequence of cry3Bb.11051 gene.

SEQ ID NO:54 Amino acid sequence of Cry3Bb.11051 polypeptide.

SEQ ID NO:55 DNA sequence of cry3Bb.11057 gene.

SEQ ID NO:56 Amino acid sequence of Cry3Bb.11057 polypeptide.

SEQ ID NO:57 DNA sequence of cry3Bb.11058 gene.

SEQ ID NO:58 Amino acid sequence of Cry3Bb.11058 polypeptide.

SEQ ID NO:59 DNA sequence of cry3Bb.11081 gene.

SEQ ID NO:60 Amino acid sequence of Cry3Bb.11081 polypeptide.

SEQ ID NO:61 DNA sequence of cry3Bb.11082 gene.

SEQ ID NO:62 Amino acid sequence of Cry3Bb.11082 polypeptide.

SEQ ID NO:63 DNA sequence of cry3Bb.11083 gene.

SEQ ID NO:64 Amino acid sequence of Cry3Bb.11083 polypeptide.

SEQ ID NO:65 DNA sequence of cry3Bb.11084 gene.

SEQ ID NO:66 Amino acid sequence of Cry3Bb.11084 polypeptide.

SEQ ID NO:67 DNA sequence of cry3Bb.11095 gene.

SEQ ID NO:68 Amino acid sequence of Cry3Bb.11095 polypeptide.

SEQ ID NO:69 DNA sequence of cry3Bb.60 gene.

SEQ ID NO:70 Amino acid sequence of Cry3Bb.60 polypeptide.

SEQ ID NO:71 Primer FW001.

SEQ ID NO:72 Primer FW006.

SEQ ID NO:73 Primer MVT095.

SEQ ID NO:74 Primer MVT097.

SEQ ID NO:75 Primer MVT091.

SEQ ID NO:76 Primer MVT075.

SEQ ID NO:77 Primer MVT076.

SEQ ID NO:78 Primer MVT111.

SEQ ID NO:79 Primer MVT094.

SEQ ID NO:80 Primer MVT103.

SEQ ID NO:81 Primer MVT081.

SEQ ID NO:82 Primer MVT085.

SEQ ID NO:83 Primer A.

SEQ ID NO:84 Primer B.

SEQ ID NO:85 Primer C.

SEQ ID NO:86 Primer D.

SEQ ID NO:87 Primer E.

SEQ ID NO:88 Primer F.

SEQ ID NO:89 Primer G.

SEQ ID NO:90 Primer WD112.

SEQ ID NO:91 Primer WD115.

SEQ ID NO:92 Primer MVT105.

SEQ ID NO:93 Primer MVT092.

SEQ ID NO:94 Primer MVT070.

SEQ ID NO:95 Primer MVT083.

SEQ ID NO:96 N-terminal amino acid of Cry3Bb polypeptide.

SEQ ID NO:97 DNA sequence of wild-type cry3Bb gene.

SEQ ID NO:98 Amino acid sequence of wild-type Cry3Bb polypeptide.

SEQ ID NO:99 Plantized DNA sequence for cry3Bb.11231 gene.

SEQ ID NO:100 Amino acid sequence of plantized Cry3Bb.11231 polypeptide.

SEQ ID NO:101 DNA sequence of cry3Bb gene used to prepare SEQ ID NO:99.

SEQ ID NO:102 DNA sequence of wild-type cry3Bb gene, Genbank #M89794.

SEQ ID NO:103 DNA sequence of Oligo #1.

SEQ ID NO:104 DNA sequence of Oligo #2.

SEQ ID NO:105 DNA sequence of Oligo #3.

SEQ ID NO:106 DNA sequence of Oligo #4.

SEQ ID NO:107 DNA sequence of cry3Bb.11098 gene.

SEQ ID NO:108 Amino acid sequence of Cry3Bb.11098 polypeptide.


7.0 References



The following references, to the extent that they provide exemplary procedural or 
other details supplementary to those set forth herein, are specifically incorporated herein by 
reference. 

U. S. Patent 4,237,224, issued December 2, 1980.
U. S. Patent 4,332,898, issued Jun. 1, 1982.
U. S. Patent 4,342,832, issued Aug. 3, 1982.
U. S. Patent 4,356,270, issued Oct. 26, 1982.
U. S. Patent 4,362,817, issued Dec. 7, 1982.
U. S. Patent 4,371,625, issued Feb. 1, 1983.
U. S. Patent 4,448,885, issued May 15, 1984.
U. S. Patent 4,467,036, issued Aug. 21, 1984.
U. S. Patent 4,554,101, issued Nov. 19, 1985.
U. S. Patent 4,683,195, issued Jul. 28, 1987.
U. S. Patent 4,683,202, issued Jul. 28, 1987.
U. S. Patent 4,757,011, issued Jul. 12, 1988.
U. S. Patent 4,766,203, issued August 23, 1988.
U. S. Patent 4,769,061, issued September 6, 1988.
U. S. Patent 4,797,279, issued January 10, 1989.
U. S. Patent 4,800,159, issued January 24, 1989.
U. S. Patent 4,883,750, issued November 28, 1989.
U. S. Patent 4,910,016, issued March 20, 1990.
U. S. Patent 4,940,835, issued Feb. 23, 1990.
U. S. Patent 4,965,188, issued Oct. 23, 1990.
U. S. Patent 4,971,908, issued Nov. 20, 1990.
U. S. Patent 4,987,071, issued Jan. 22, 1991.
U. S. Patent 5,380,831, issued Jan. 10, 1995.
U. S. Patent 5,023,179, issued June 11, 1991.
U. S. Patent 5,024,837, issued June 18, 1991.
U. S. Patent 5,126,133, issued June 30, 1992.
U. S. Patent 5,176,995, issued Oct. 15, 1991.






U. S. Patent 5,187,091, issued XXXXX, 1993.
U. S. Patent 5,322,687, issued Jun. 21, 1994.
U. S. Patent 5,334,711, issued Aug. 2, 1994.
U. S. Patent 5,378,619, issued January 3, 1995.
U. S. Patent 5,424,412, issued June 13, 1995.
U. S. Patent 5,441,884, issued Aug. 15, 1995.
U. S. Patent 5,463,175, issued October 31, 1995.
U. S. Patent 5,500,365, issued Mar. 19, 1996.
U. S. Patent 5,591,616, issued January 7, 1997.
U. S. Patent 5,631,359, issued May 20, 1997.
U. S. Patent 5,659,123, issued August 19, 1997.
Eur. Pat. No. EP 0120516.
Eur. Pat. No. EP 0360257.
Eur. Pat. Appl. No. 92110298.4.
Eur. Pat. Appl. No. 295156A1.
Great Britain Patent 2202328.
Int. Pat. Appl. Publ. No. WO 91/03162.
Int. Pat. Appl. Publ. No. WO 92/07065.
Int. Pat. Appl. Publ. No. WO 93/15187.
Int. Pat. Appl. Publ. No. WO 93/23569.
Int. Pat. Appl. Publ. No. WO 94/02595.
Int. Pat. Appl. Publ. No. WO 94/13688.
Intl. Pat. Appl. Publ. No. PCT/US87/00880.
Intl. Pat. Appl. Publ. No. PCT/US89/01025.
Intl. Pat. Appl. Publ. No. WO 88/09812.
Intl. Pat. Appl. Publ. No. WO 88/10315.
Intl. Pat. Appl. Publ. No. WO 89/06700.
Intl. Pat. Appl. Publ. No. WO 93/07278.
Abbott, "A method for computing the effectiveness of an insecticide," J. Econ. Entomol., 18:265-267, 1925.




Abdullah et al., Biotechnology, 4:1087, 1986.
Almond and Dean, Biochemistry, 32:1040-1046, 1993.
An et al., EMBO J., 4:277-287, 1985.
Angsuthanasombat et al., FEMS Microbiol. Lett., 111:255-262, 1993.
Armstrong et al., Plant Cell Rep., 9:335-339, 1990.
Aronson, Wu, Zhang, "Mutagenesis of specificity and toxicity regions of a Bacillus thuringiensis protoxin gene," J. Bacteriol., 177:4059-4065, 1995.
Bagdasarian et al., Gene, 16:237, 1981.
Baum et al., Appl. Environ. Microbiol., 56:3420-3428, 1990.
Baum, "Tn5401, a new class II transposable element from Bacillus thuringiensis," J. Bacteriol., 176:2835-2845, 1994.
Baum, J. Bacteriol., 177:4036-4042, 1995.
Baum, Kakefuda, Gawron-Burke, "Engineering Bacillus thuringiensis Bioinsecticides with an Indigenous Site-Specific Recombination System," Appl. Environ. Microbiol., 62:XXX-XXX, 1996.
Benbrook et al., In: Proceedings Bio Expo 1986, Butterworth, Stoneham, MA, pp. 27-54, 1986.
Bevan et al., Nature, 304:184, 1983.
Bolivar et al., Gene, 2:95, 1977.
Branden and Tooze, "Introduction to Protein Structure," Garland Publishing, Inc., New York, NY, 1991.
Brussock and Currier, "Use of sodium dodecyl sulfate-polyacrylamide gel electrophoresis to quantify Bacillus thuringiensis δ-endotoxins," In: "Analytical Chemistry of Bacillus thuringiensis," L.A. Hickle and W.L. Fitch, (Eds), American Chemical Society, Washington D.C., pp. 78-87, 1990.
Capecchi, "High efficiency transformation by direct microinjection of DNA into cultured mammalian cells," Cell, 22(2):479-488, 1980.
Caramori, Albertini, Galizzi, "In vivo generation of hybrids between two Bacillus thuringiensis insect-toxin-encoding genes," Gene, 98:37-44, 1991.






Cashmore et al., Gen. Eng. of Plants, Plenum Press, New York, 29-38, 1983.
Chambers et al., Appl. Environ. Microbiol., 173:3966-3976, 1991.
Chau et al., Science, 244:174-181, 1989.
Chen et al., Nucl. Acids Res., 20:4581-9, 1992.
Chen, Curtiss, Alcantara, Dean, "Mutations in domain I of Bacillus thuringiensis δ-endotoxin CryIAb reduce the irreversible binding of toxin to Manduca sexta brush border membrane vesicles," J. Biol. Chem., 270:6412-6419, 1995.
Chen, Lee, Dean, "Site-directed mutations in a highly conserved region of Bacillus thuringiensis δ-endotoxin affect inhibition of short circuit current across Bombyx mori midguts," Proc. Natl. Acad. Sci. USA, 90:9041-9045, 1993.
Chowrira and Burke, Nucl. Acids Res., 20:2835-2840, 1992.
Clapp, "Somatic gene therapy into hematopoietic cells. Current status and future implications," Clin. Perinatol., 20(1):155-168, 1993.
Cody, Luft, Jensen, Pangborn, English, "Purification and crystallization of insecticidal δ-endotoxin CryIIIB2 from Bacillus thuringiensis," Proteins: Struct. Funct. Genet., 14:324, 1992.
Collins and Olive, Biochem., 32:2795-2799, 1993.
Conway and Wickens, In: RNA Processing, p. 40, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1988.
Cornelissen et al., "A tobacco mosaic virus-induced tobacco protein is homologous to the sweet-tasting protein thaumatin," Nature, 321(6069):531-532, 1986.
Cramer, Cohen, Merrill, Song, "Structure and dynamics of the colicin E1 channel," Molec. Microbiol., 4:519-526, 1990.
CRC Handbook of Chemistry and Physics, 58th edition, CRC Press, Inc., Cleveland, Ohio, p. C-769, 1977.
Cristou et al., Plant Physiol., 87:671-674, 1988.
Curiel, Agarwal, Wagner, Cotten, "Adenovirus enhancement of transferrin-polylysine-mediated gene delivery," Proc. Natl. Acad. Sci. USA, 88(19):8850-8854, 1991.






Curiel, Wagner, Cotten, Birnstiel, Agarwal, Li, Loechel, Hu, "High-efficiency gene transfer mediated by adenovirus coupled to DNA-polylysine complexes," Hum. Gen. Ther., 3(2):147-154, 1992.
Daum, "Revision of two computer programs for probit analysis," Bull. Entomol. Soc. Amer., 16:10-15, 1970.
De Maagd, Kwa, van der Klei, Yamamoto, Schipper, Vlak, Stiekema, Bosch, "Domain III substitution in Bacillus thuringiensis delta-endotoxin CryIA(b) results in superior toxicity for Spodoptera exigua and altered membrane protein recognition," Appl. Environ. Microbiol., 62:1537-1543, 1996.
Dean et al., Nucl. Acids Res., 14(5):2229, 1986.
Dhir et al., Plant Cell Reports, 10:97, 1991.
Diehn et al., Genet. Engineer., 18:83-99, 1996.
Donovan, Dankocsik, Gilbert, Groat, Gawron-Burke, Carlton, "The P2 protein of Bacillus thuringiensis var. kurstaki: nucleotide sequence and entomocidal activity," J. Biol. Chem., 263:561-567, 1988.
Doyle et al., J. Biol. Chem., 261(20):9228-9236, 1986.
Dropulic et al., J. Virol., 66:1432-41, 1992.
Dunitz, "The entropic cost of bound water in crystals and biomolecules," Science, 264:670-68x, 1994.
Earp and Ellar, Nucl. Acids Res., 15:3619, 1987.
Eglitis and Anderson, "Retroviral vectors for introduction of genes into mammalian cells," Biotechniques, 6(7):608-614, 1988.
Eglitis, Kantoff, Kohn, Karson, Moen, Lothrop, Blaese, Anderson, "Retroviral-mediated gene transfer into hemopoietic cells," Adv. Exp. Med. Biol., 241:19-27, 1988.
Elroy-Stein and Moss, Proc. Natl. Acad. Sci. USA, 87:6743-7, 1990.
English and Slatin, Insect Biochem. Mol. Biol., 22:1-7, 1992.
English, Readdy, Bastian, "Delta-endotoxin-induced leakage of 86Rb+-K+ and H2O from phospholipid vesicles is catalyzed by reconstituted midgut membrane," Insect Biochem., 21:177-184, 1991.






Fischhoff et al., Bio/Technology, 5:807-813, 1987.
Fraley et al., Bio/Technology, 3:629-635, 1985.
Fraley et al., Proc. Natl. Acad. Sci. USA, 80:4803, 1983.
Frohman, PCR™ Protocols, a Guide to Methods and Applications XVIII Ed., Academic Press, New York, 1990.
Fromm et al., Bio/Technology, 8:833-839, 1990.
Fromm et al., Nature, 319:791-793, 1986.
Fromm, Taylor, Walbot, "Expression of genes transferred into monocot and dicot plant cells by electroporation," Proc. Natl. Acad. Sci. USA, 82(17):5824-5828, 1985.
Fujimura et al., Plant Tissue Cult. Lett., 2:74, 1985.
Fynan, Webster, Fuller, Haynes, Santoro, Robinson, "DNA vaccines: protective immunizations by parenteral, mucosal, and gene gun inoculations," Proc. Natl. Acad. Sci. USA, 90(24):11478-11482, 1993.
Galitsky, Cody, Wojtczak, Ghosh, Luft, Pangborn, Wawrzak, English, "Crystal and Molecular Structure of the Insecticidal Bacterial δ-Endotoxin CryIIIB2 of Bacillus thuringiensis," Research Communication to Ecogen Inc., Langhorne, PA, 1993.
Gao and Huang, Nucl. Acids Res., 21:2867-72, 1993.
Gazit and Shai, "Structural and Functional Characterization of the α-5 segment of Bacillus thuringiensis δ-endotoxin," Biochemistry, 32:3429-3436, 1993.
Gazit and Shai, "The assembly and organization of the α5 and α7 helices from the pore-forming domain of Bacillus thuringiensis δ-endotoxin," J. Biol. Chem., 270:2571-2578, 1995.
Ge, Rivers, Milne, Dean, "Functional domains of Bacillus thuringiensis insecticidal crystal proteins: refinement of Heliothis virescens and Trichoplusia ni specificity domains on CryIA(c)," J. Biol. Chem., 266:17954-17958, 1991.
Genovese and Milcarek, In: RNA Processing, p. 62, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1988.
Gil and Proudfoot, Nature, 312:473, 1984.
Gonzalez Jr. et al., Proc. Natl. Acad. Sci. USA, 79:6951-6955, 1982.




Graham and van der Eb, "Transformation of rat cells by DNA of human adenovirus 5," Virology, 54(2):536-539, 1973.
Grochulski, Masson, Borisova, Pusztai-Carey, Schwartz, Brousseau, Cygler, "Bacillus thuringiensis CryIA(a) insecticidal toxin: crystal structure and channel formation," J. Mol. Biol., 254:447-464, 1995.
Guerrier-Takada et al., Cell, 35:849, 1983.
Hampel and Tritz, Biochem., 28:4929, 1989.
Hampel et al., Nucl. Acids Res., 18:299, 1990.
Harlow and Lane, "Antibodies: A Laboratory Manual," Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1988.
Herrera-Estrella et al., Nature, 303:209, 1983.
Hertel et al., Nucl. Acids Res., 20:3252, 1992.
Hertig et al., Plant Mol. Biol., 16:171-174, 1991.
Hess, Intern Rev. Cytol., 107:367, 1987.
Hills and Peters, J. Econ. Entomol., 64:764-765, 1971.
Hockema, In: The Binary Plant Vector System, Offset-durkkerij, Kanters B.V., Alblasserdam, Chapter 5.
Höfte and Whitely, Microbiol. Rev., 53:242-255, 1989.
Holland et al., Biochemistry, 17:4900, 1978.
Holsters et al., Mol. Gen. Genet., 163:181-187, 1978.
Honee, van der Salm, Visser, Nucl. Acids Res., 16:6240, 1988.
Horsch et al., Science, 227:1229-1231, 1985.
Humason, In: Animal Tissue Techniques, W.H. Freeman and Company, 1967.
Jaeger et al., Proc. Natl. Acad. Sci. USA, 86:7706-7710, 1989.
Johnston and Tang, "Gene gun transfection of animal cells and genetic immunization," Methods Cell. Biol., 43(A):353-365, 1994.
Jorgensen et al., Mol. Gen. Genet., 207:471, 1987.
Kaiser and Kezdy, Science, 223:249-255, 1984.
Kashani-Saber et al., Antisense Res. Dev., 2:3-15, 1992.
Keller et al., EMBO J., 8:1309-14, 1989.






Klee et al., Bio/Technology, 3:637-642, 1985.
Klein et al., Nature, 327:70, 1987.
Klein et al., Proc. Natl. Acad. Sci. USA, 85:8502-8505, 1988.
Kozak, Nature, 308:241-246, 1984.
Krieg et al., Anzeiger für Schädlingskunde Pflanzenschutz Umweltschutz, 57:145-150, 1984.
Krieg et al., Z. ang. Ent., 96:500-508, 1983.
Kuby, Immunology 2nd Edition, W. H. Freeman & Company, NY, 1994.
Kunkle, "Rapid and efficient site-specific mutagenesis without phenotypic selection," Proc. Natl. Acad. Sci. USA, 82:488-492, 1985.
Kunkle, Roberts, Zabour, Methods Enzymol., 154:367-382, 1987.
Kwak, Lu, Dean, "Exploration of receptor binding of Bacillus thuringiensis toxins," Mem. Inst. Oswaldo, 90:75-79, 1995.
Kwoh et al., Proc. Natl. Acad. Sci. USA, 86(4):1173-1177, 1989.
Kyte and Doolittle, J. Mol. Biol., 157:105-132, 1982.
L'Huillier et al., EMBO J., 11:4411-8, 1992.
LaBean and Kauffman, "Design of synthetic gene libraries encoding random sequence proteins with desired ensemble characteristics," Prot. Sci., 2:1249-1254, 1993.
Lambert, Buysse, Decock, Jansens, Piens, Saey, Seurinck, Van Audenhove, Van Rie, Van Vliet, Peferoen, "A Bacillus thuringiensis insecticidal crystal protein with a high activity against members of the family Noctuidae," Appl. Environ. Microbiol., 62:80-86, 1996.
Lee, Milne, Ge, Dean, "Location of a Bombyx mori receptor binding region on a Bacillus thuringiensis δ-endotoxin," J. Biol. Chem., 267:3115-3121, 1992.
Lee, Young, Dean, "Domain III exchanges of Bacillus thuringiensis CryIA toxins affect binding to different gypsy moth midgut receptors," Biochem. Biophys. Res. Commun., 216:306-312, 1995.
Li, Carroll, Ellar, "Crystal structure of insecticidal δ-endotoxin from Bacillus thuringiensis at 2.5 Å resolution," Nature (London), 353:815-821, 1991.
Lieber et al., Methods Enzymol., 217:47-66, 1993.




Lindstrom et al., Developmental Genetics, 11:160, 1990.
Lisziewicz et al., Proc. Natl. Acad. Sci. U.S.A., 90:8000-4, 1993.
Lorz et al., Mol. Gen. Genet., 199:178, 1985.
Lu, Rajamohan, Dean, "Identification of amino acid residues of Bacillus thuringiensis δ-endotoxin CryIAa associated with membrane binding and toxicity to Bombyx mori," J. Bacteriol., 176:5554-5559, 1994.
Lu, Xiao, Clapp, Li, Broxmeyer, "High efficiency retroviral mediated gene transduction into single isolated immature and replatable CD34(3+) hematopoietic stem/progenitor cells from human umbilical cord blood," J. Exp. Med., 178(6):2089-2096, 1993.
Macaluso and Mettus, J. Bacteriol., 173:1353-1356, 1991.
Maddock et al., Third International Congress of Plant Molecular Biology, Abstract 372, 1991.
Maloy et al., "Microbial Genetics" 2nd Edition. Jones and Bartlett Publishers, Boston, MA, 1994.
Maloy, "Experimental Techniques in Bacterial Genetics" Jones and Bartlett Publishers, Boston, MA, 1990.
Maniatis, Fritsch, Sambrook, In: Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Marcotte et al., Nature, 335:454, 1988.
McDevitt et al., Cell, 37:993-999, 1984.
McElroy et al., Plant Cell, 2:163-171, 1990.
Mettus and Macaluso, Appl. Environ. Microbiol., 56:1128-1134, 1990.
Michael, "Mutagenesis by Incorporation of a Phosphorylated Oligo During PCR™ Amplification," BioTechniques, 16(3):410-412, 1994.
Neuhaus et al., Theor. Appl. Genet., 75:30, 1987.
Odell et al., Nature, 313:810, 1985.
Ohara et al., Proc. Natl. Acad. Sci. USA, 86(15):5673-5677, 1989.
Ohkawa et al., Nucl. Acids Symp. Ser., 27:15-6, 1992.
Ojwang et al., Proc. Natl. Acad. Sci. USA, 89:10802-6, 1992.




Olson et al., J. Bacteriol., 150:6069, 1982.
Omirulleh et al., Plant Molecular Biology, 21:415-428, 1993.
Pandey and Marzluff, In "RNA Processing," p. 133, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1987.
Pena et al., Nature, 325:274, 1987.
Perrault et al., Nature, 344:565, 1990.
Perrotta and Been, Biochem., 31:16, 1992.
Pieken et al., Science, 253:314, 1991.
Poszkowski et al., EMBO J., 3:2719, 1989.
Potrykus et al., Mol. Gen. Genet., 199:183, 1985.
Poulsen et al., Mol. Gen. Genet., 205:193-200, 1986.
Prokop and Bajpai, "Recombinant DNA Technology I," Ann. N.Y. Acad. Sci., 646:xxx-xxx, 1991.
Rajamohan, Alcantara, Lee, Chen, Curtiss, Dean, "Single amino acid changes in domain II of Bacillus thuringiensis CryIAb δ-endotoxin affect irreversible binding to Manduca sexta midgut membrane vesicles," J. Bacteriol., 177:2276-2282, 1995.
Rajamohan, Cotrill, Gould, Dean, "Role of domain II, loop 2 residues of Bacillus thuringiensis CryIAb δ-endotoxin in reversible and irreversible binding to Manduca sexta and Heliothis virescens," J. Biol. Chem., 271:2390-2397, 1996.
Rogers et al., In: Methods For Plant Molecular Biology, A. Weissbach and H. Weissbach, eds., Academic Press Inc., San Diego, CA, 1988.
Rogers et al., Methods Enzymol., 153:253-277, 1987.
Rossi et al., Aids Res. Hum. Retrovir., 8:183, 1992.
Sadofsky and Alwine, Molec. Cell Biol., 4(8):1460-1468, 1984.
Sambrook et al., "Molecular Cloning: A Laboratory Manual," Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989.
Sanchis, Lereclus, Menou, Chaufaux, Guo, Lecadet, Mol. Microbiol., 3:229-238, 1989.
Sanchis, Lereclus, Menou, Chaufaux, Lecadet, Mol. Microbiol., 2:393-404, 1988.
Sarver et al., Science, 247:1222-1225, 1990.




Saville and Collins, Cell, 61:685-696, 1990.
Saville and Collins, Proc. Natl. Acad. Sci. USA, 88:8826-8830, 1991.
Scanlon et al., Proc. Natl. Acad. Sci. USA, 88:10591-5, 1991.
Scaringe et al., Nucl. Acids Res., 18:5433-5441, 1990.
Schnepf and Whitely, Proc. Natl. Acad. Sci. USA, 78:2893-2897, 1981.
Schnepf et al., J. Biol. Chem., 260:6264-6272, 1985.
Segal, "Biochemical Calculations" 2nd Edition, John Wiley & Sons, New York, 1976.
Shaw and Kamen, Cell, 46:659-667, 1986.
Shaw and Kamen, In: "RNA Processing", p. 220, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1987.
Simpson, Science, 233:34, 1986.
Slaney, Robbins, English, "Mode of action of Bacillus thuringiensis toxin CryIIIA: An analysis of toxicity in Leptinotarsa decemlineata (Say) and Diabrotica undecimpunctata howardi Barber," Insect Biochem. Molec. Biol., 22:9-18, 1992.
Slatin, Abrams, English, "Delta-endotoxins form cation-selective channels in planar lipid bilayers," Biochem. Biophys. Res. Comm., 169(2):765-772, 1990.
Smedley and Ellar, "Mutagenesis of three surface-exposed loops of a Bacillus thuringiensis insecticidal toxin reveals residues important for toxicity, receptor recognition and possibly membrane insertion," Microbiology, 142:1617-1624, 1996.
Smith and Ellar, "Mutagenesis of two surface-exposed loops of the Bacillus thuringiensis CryIC δ-endotoxin affects insecticidal specificity," Biochem. J., 302:611-616, 1994.
Smith, Merrick, Bone, Ellar, Appl. Environ. Microbiol., 62:680-684, 1996.
Spielmann et al., Mol. Gen. Genet., 205:34, 1986.
Stemmer and Morris, "Enzymatic Inverse PCR™: A Restriction Site Independent, Single-Fragment Method for High-Efficiency, Site-Directed Mutagenesis," BioTechniques, 13(2):214-220, 1992.
Stemmer, Proc. Natl. Acad. Sci. USA, 91:10747-1075, 1994.
Taira et al., Nucl. Acids Res., 19:5125-30, 1991.




Tomic et al., Nucl. Acids Res., 12:1656, 1990.
Tomic, Sunjevaric, Savtchenko, Blumenberg, "A rapid and simple method for introducing specific mutations into any position of DNA leaving all other positions unaltered," Nucleic Acids Res., 18(6):1656, 1990.
Toriyama et al., Theor Appl. Genet., 73:16, 1986.
Uchimiya et al., Mol. Gen. Genet., 204:204, 1986.
Upender et al., Biotechniques, 18:29-31, 1995.
Usman and Cedergren, TIBS, 17:34, 1992.
Usman and Cedergren, Trends in Biochem. Sci., 17:334, 1992.
Usman et al., J. Am. Chem. Soc., 109:7845-7854, 1987.
Vallette, Merge, Reiss, Adesnik, "Construction of mutant and chimeric genes using the polymerase chain reaction," Nucl. Acids Res., 17:723-733, 1989.
Vasil et al., "Herbicide-resistant fertile transgenic wheat plants obtained by microprojectile bombardment of regenerable embryogenic callus," Biotechnology, 10:667-674, 1992.
Vasil, Biotechnology, 6:397, 1988.
Velten and Schell, Nucl. Acids Res., 13:6981-6998, 1985.
Velten et al., EMBO J., 3:2723-2730, 1984.
Ventura et al., Nucl. Acids Res., 21:3249-55, 1993.
Vodkin et al., Cell, 34:1023, 1983.
Vogel et al., J. Cell Biochem., Suppl. 13D:312, 1989.
Von Tersch, Slatin, Kulesza, English, "Membrane permeabilizing activity of Bacillus thuringiensis Coleopteran-active toxins CryIIIB2 and CryIIIB2 domain 1 peptides," Appl. Env. Microbiol., 60:3711-3717, 1994.
Wagner, Zatloukal, Cotten, Kirlappos, Mechtler, Curiel, Birnstiel, "Coupling of adenovirus to transferrin-polylysine/DNA complexes greatly enhances receptor-mediated gene delivery and expression of transfected genes," Proc. Natl. Acad. Sci. USA, 89(13):6099-6103, 1992.
Walker et al., Proc. Natl. Acad. Sci. USA, 89(1):392-396, 1992.
Walters et al., Biochem. Biophys. Res. Commun., 196:921-926, 1993.




Watson et al., Molecular Biology of the Gene, 4th Ed., W. A. Benjamin, Inc., Menlo Park, CA, 1987.
Weerasinghe et al., J. Virol., 65:5531-4, 1991.
Weissbach and Weissbach, Methods for Plant Molecular Biology, (eds.), Academic Press, Inc., San Diego, CA, 1988.
Wenzler et al., Plant Mol. Biol., 12:41-50, 1989.
Wickens and Stephenson, Science, 226:1045, 1984.
Wickens et al., In: "RNA Processing," p. 9, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1987.
Wolfersberger et al., Appl. Environ. Microbiol., 62:279-282, 1996.
Wong and Neumann, "Electric field mediated gene transfer," Biochim. Biophys. Res. Commun., 107(2):584-587, 1982.
Woolf et al., Proc. Natl. Acad. Sci. USA, 89:7305-7309, 1992.
Wu and Aronson, "Localized mutagenesis defines regions of the Bacillus thuringiensis δ-endotoxin involved in toxicity and specificity," J. Biol. Chem., 267:2311-2317, 1992.
Wu and Dean, "Functional significance of loops in the receptor binding domain of Bacillus thuringiensis CryIIIA δ-endotoxin," J. Mol. Biol., 255:628-640, 1996.
Yamada et al., Plant Cell Rep., 4:85, 1986.
Yang et al., Proc. Natl. Acad. Sci. USA, 87:4144-48, 1990.
Yu et al., Proc. Natl. Acad. Sci. USA, 90:6340-4, 1993.
Zatloukal, Wagner, Cotten, Phillips, Plank, Steinlein, Curiel, Birnstiel, "Transferrinfection: a highly efficient way to express gene constructs in eukaryotic cells," Ann. N.Y. Acad. Sci., 660:136-153, 1992.
Zhang and Matthews, "Conservations of solvent-binding sites in 10 crystal forms of T4 lysozyme," Prot. Sci., 3:1031-1039, 1994.
Zhou et al., Mol. Cell Biol., 10:4529-37, 1990.






8.0 Sequence Listing 



(1) GENERAL INFORMATION: 

(i) APPLICANT: English, Leigh H. 

Brussock, Susan M. 
Malvar, Thomas M. 
Bryson, James W. 
Kulesza, Caroline A. 
Walters, Frederick S. 
Slatin, Stephen L. 
Von Tersch, Michael A. 
Romano , Cha r 1 e s 

(ii) TITLE OF INVENTION: NUCLEIC ACID SEGMENTS ENCODING MODIFIED 
COLEOPTERAN- TOXIC CRYSTAL PROTEINS 

(iii) NUMBER OF SEQUENCES: 113 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Arnold, White & Durkee 

(B) STREET: P.O. Box 4433 

(C) CITY: Houston 

(D) STATE: Texas 

(E) COUNTRY: USA 

(F) ZIP: 77210 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: Patentln Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US Unknown 

(B) FILING DATE: Concurrently Herewith 

(C) CLASSIFICATION: Unknown 

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: Kitchell, Barbara S. 

(B) REGISTRATION NUMBER: 33,928 

(C) REFERENCE /DOCKET NUMBER: MECO:149 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 512/418-3106 

(B) TELEFAX: 512/474-7577 
(2) INFORMATION FOR SEQ ID NO: 1:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA TTT CAC CAT TCT CGT CGT TCT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser
145 150 155 160






AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 






CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 
AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
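
As a quick consistency check on the listing above, the following minimal Python sketch (not part of the original filing) translates codons with the standard genetic code and confirms that a CDS spanning positions 1..1956 corresponds to 652 codons, with the final three bases (TAA) forming the stop codon. The function name `translate` and the way the codon table is built are illustrative assumptions; only the two codon rows are copied from SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order. This table is an
# assumption; the listing itself does not state which genetic code applies.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First and last codon rows of SEQ ID NO: 1 as printed in the listing.
first_row = "ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT"
last_row = "TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA"

# CDS location 1..1956 excludes the stop codon at 1957..1959.
cds_codons = 1956 // 3

print(translate(first_row))  # MNPNNRSEHDTIKVTP
print(translate(last_row))   # YIDKIEFIPVQL*
print(cds_codons)            # 652
```

The 652 codons agree with the 652-residue length stated for SEQ ID NO: 2.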



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTC CTT AGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Leu Ser Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 
TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys lie 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr lie Asp Lys lie Glu Phe lie Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Leu Ser Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 
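
The residues at which SEQ ID NO: 4 departs from SEQ ID NO: 2 can be located by a position-wise comparison of the two listings. The sketch below is illustrative only and not part of the original filing: the helper name `substitutions` is hypothetical, and the two fragments are residues 145-160 as printed in the listings above.

```python
def substitutions(ref: str, alt: str, start: int = 1):
    """Return (position, ref_residue, alt_residue) for each mismatch.

    Both inputs are whitespace-separated three-letter residue codes of
    equal length; `start` is the position number of the first residue.
    """
    ref_res, alt_res = ref.split(), alt.split()
    if len(ref_res) != len(alt_res):
        raise ValueError("fragments must have the same number of residues")
    return [(start + i, a, b)
            for i, (a, b) in enumerate(zip(ref_res, alt_res))
            if a != b]


# Residues 145-160 copied from SEQ ID NO: 2 and SEQ ID NO: 4.
seq2_145_160 = "Val Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser"
seq4_145_160 = "Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser"

diffs = substitutions(seq2_145_160, seq4_145_160, start=145)
for pos, old, new in diffs:
    print(f"{old}{pos}{new}")
```

Applied to full-length sequences, the same comparison would also report the differences printed at residues 230 and 231 (Tyr His in SEQ ID NO: 2, Leu Ser in SEQ ID NO: 4).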



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 
TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT CCA GAA 672
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Pro Glu 
210 215 220 

GAT GTT GCT GAA TTC AGT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Ser His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 
GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 
AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly lie Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Pro Glu 
210 215 220 

Asp Val Ala Glu Phe Ser His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400

Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys lie 
625 630 635 640 

Tyr lie Asp Lys lie Glu Phe lie Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1. . 1956 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640






TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT AAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Asn Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

TCT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Ser Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 10:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Asn Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Ser Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT ACC AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Thr Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Thr Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 
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ATT AAT TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
lie Asn Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Asn Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TTA CTT ACT ACG CTT CAG AAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ACC CTT AAT ACA CTA CAG AAG TGC GGA CCA 960
Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Cys Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95



Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Cys Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640



Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95



AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT GCC GTT AAT ACT CTG TGG GAA TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
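
As a consistency check on the closing row of the listing above, the relationship between the codon row, its translation, and the stated lengths can be sketched in Python. This is only an illustrative sketch, not part of the original listing: it assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate_row` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate_row(row: str) -> list[str]:
    """Translate a space-separated row of codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in row.split()]


# Final codon row of the nucleotide listing (positions 1921 to 1959).
LAST_ROW = "TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA"

residues = translate_row(LAST_ROW)
print(" ".join(residues))

# Length bookkeeping: the last three bases are the stop codon.
TOTAL_LENGTH = 1959
CDS_END = TOTAL_LENGTH - 3        # last base of the coding region
PROTEIN_LENGTH = CDS_END // 3     # number of encoded residues
print(CDS_END, PROTEIN_LENGTH)
```

Under these assumptions the row translates to the twelve residues printed beneath it followed by a stop, and the coding region of 1956 bases corresponds to 652 residues.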



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TTA CTT ACT ACG CTT CAG AAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ACG CCA ACC ACC CTA CAG GAT TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Thr Pro Thr Thr Leu Gln Asp Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 





GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
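
For orientation, the translation rows covering residues 305 to 320 under SEQ ID NOs: 19, 21 and 23 can be compared position by position. The Python sketch below is illustrative only and is not part of the original listing: the row strings are copied from the translation lines above with residue names written in standard three-letter form (Ile, Gln), and the helper name `differences` is hypothetical. It reports only the differences within this one row and makes no claim about the rest of the sequences.

```python
# Residues 305 to 320 as printed in the translation rows of the listing.
ROWS = {
    "SEQ ID NO: 19": "Phe Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro",
    "SEQ ID NO: 21": "Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro",
    "SEQ ID NO: 23": "Phe Thr Asp Pro Ile Phe Thr Pro Thr Thr Leu Gln Asp Tyr Gly Pro",
}
FIRST_POSITION = 305  # residue number of the first entry in each row


def differences(reference: str, other: str, start: int) -> list[tuple[int, str, str]]:
    """Return (position, reference residue, other residue) for each mismatch."""
    ref, oth = reference.split(), other.split()
    if len(ref) != len(oth):
        raise ValueError("rows must contain the same number of residues")
    return [(start + i, a, b) for i, (a, b) in enumerate(zip(ref, oth)) if a != b]


diffs_21 = differences(ROWS["SEQ ID NO: 19"], ROWS["SEQ ID NO: 21"], FIRST_POSITION)
diffs_23 = differences(ROWS["SEQ ID NO: 19"], ROWS["SEQ ID NO: 23"], FIRST_POSITION)

for label, diffs in (("SEQ ID NO: 21", diffs_21), ("SEQ ID NO: 23", diffs_23)):
    for position, ref_residue, new_residue in diffs:
        print(f"{label}: position {position} {ref_residue} -> {new_residue}")
```

Within this row, both comparisons differ from SEQ ID NO: 19 at positions 311, 312, 313, 316 and 317, while the flanking residues are identical.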



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Pro Thr Thr Leu Gln Asp Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT GCC CTG AAT ACC TTA GAC GAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ala Leu Asn Thr Leu Asp Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
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(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ala Leu Asn Thr Leu Asp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAC GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ACT AGG CGA TTC AGA AAG GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Thr Arg Arg Phe Arg Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480






TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Thr Arg Arg Phe Arg Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TTA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 31: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80



GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ATC CTC AAT ACG CTA CAG GAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ile Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ile Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335



Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 
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Phe Asp Leu Ala Thr 
610 

Asn Glu Leu lie lie 
625 

Tyr lie Asp Lys lie 
645 



Thr Asn Ser Asn Met Gly 
615 

Gly Ala Glu Ser Phe Val 
630 635 

Glu Phe He Pro Val Gin 
650 



Phe Ser Gly Asp Lys 
620 

Ser Asn Glu Lys He 
640 

Leu 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110






CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu lie Asp Lys Lys lie Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT ATC CTA CAT ACG CTG CAG GAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ile Leu His Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr lie Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gin Val Glu Val Leu lie Asp Lys Lys lie Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

Phe Thr Asp Pro lie Phe lie Leu His Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 



Thr Phe Leu Ser lie Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 






Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 



Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCC CTC GTT AAC CTA ATG GTG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Val Asn Leu Met Val Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr He Lys Val Thr Pro 
15 10 15 

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

Phe Thr Asp Pro lie Phe Ser Leu Val Asn Leu Met Val Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380



Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys lie 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 37:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin lie Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 

Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 

225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 

Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 

lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT TCT CTT AGG ACA CCA CTT GCG TAC GGA CCA 960 

Phe Thr Asp Pro lie Phe Ser Leu Arg Thr Pro Leu Ala Tyr Gly Pro 

305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 

Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr lie Lys Val Thr Pro 
15 10 15 

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110



Gin Val Glu Val Leu lie Asp Lys Lys lie Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg lie Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 



Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Arg Thr Pro Leu Ala Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400



Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys lie 
625 630 635 640 

Tyr lie Asp Lys lie Glu Phe lie Pro Val Gin Leu 
645 650 
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(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TTC AAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Phe Asn
275 280 285

ATT TTG CTT TAC AGT AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Leu Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650
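
The CDS feature given for SEQ ID NO: 39 (positions 1..1956 of a 1959-base sequence) corresponds to 652 codons followed by a single TAA stop codon. The following is a minimal sketch of that arithmetic and of the codon-by-codon reading of the first row, assuming the standard genetic code. The codon table is deliberately partial (only the codons in the first row plus TAA), and the function names are illustrative assumptions, not part of the listing.

```python
# Partial standard genetic code: only the codons needed for the example row
# below, plus the TAA stop codon. A full table would have 64 entries.
CODON_TABLE = {
    "ATG": "Met", "AAT": "Asn", "CCA": "Pro", "AAC": "Asn",
    "CGA": "Arg", "AGT": "Ser", "GAA": "Glu", "CAT": "His",
    "GAT": "Asp", "ACG": "Thr", "ATA": "Ile", "AAG": "Lys",
    "GTT": "Val", "ACA": "Thr", "CCT": "Pro", "TAA": "Stop",
}


def translate(dna):
    """Translate a DNA string codon by codon into three-letter residue names."""
    dna = "".join(dna.split()).upper()  # drop the spaces between codons
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def cds_residue_count(start, end):
    """Number of codons in a CDS spanning start..end (1-based, inclusive)."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length // 3


# First row of SEQ ID NO: 39 as printed in the listing (bases 1 to 48).
first_row = "ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT"

print(" ".join(translate(first_row)))
print(cds_residue_count(1, 1956))
```

Under these assumptions the translated first row matches residues 1 to 16 of the listing, and the codon count matches the 652 amino acids stated for SEQ ID NO: 40.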



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Phe Asn
275 280 285

Ile Leu Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400






Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT GTG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Val Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Val Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:



ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn
145 150 155 160

CCA CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 
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TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 6 72 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

CCA CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1953 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr He Lys Val Thr Pro 
15 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 
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ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 24 0 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 3 36 

Asn Thr He Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gin 
100 105 110 

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 3 84 

Val Glu Val Leu He Asp Lys Lys lie Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 4 32 

Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT AAA 480 
Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser Lys 
145 150 155 160 

AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528 
Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser His 
165 170 175 

TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 5 76 

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624 
Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu Leu 
195 , 200 205 

AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672 
Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr 
225 230 235 240 

GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768 
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 
TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816 
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864 
Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile 
275 280 285 

CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912 
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe 
290 295 300 

ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr 
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly 
340 345 350 

AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152 
Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 1200 
Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr 
385 390 395 400 

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 1248 
Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296 
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys 
420 425 430 

AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 1392 
Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp 
450 455 460 
GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440 
Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 1488 
Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr 
485 490 495 

CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536 
His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr 
500 505 510 

CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584 
Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile 
515 520 525 

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632 
Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680 
Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728 
Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776 
Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr 
580 585 590 

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824 
Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe 
595 600 605 

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872 
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920 
Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr 
625 630 635 640 

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956 
Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln 
100 105 110 

Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser Lys 
145 150 155 160 

Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His 
165 170 175 

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr 
225 230 235 240 

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile 
275 280 285 

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe 
290 295 300 

Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr 
305 310 315 320 

Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly 
340 345 350 

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr 
385 390 395 400 

Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys 
420 425 430 

Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr 
485 490 495 

His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr 
500 505 510 

Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile 
515 520 525 

Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr 
580 585 590 

Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe 
595 600 605 

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr 
625 630 635 640 

Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 


(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 
GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC GGA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Gly Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 
ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 
TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Gly Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln 
420 425 430 

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 


(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1953 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 
GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 336 
Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln 
100 105 110 

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 384 
Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 432 
Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT CCA 480 
Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn Pro 
145 150 155 160 

CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528 
His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His 
165 170 175 

TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 576 
Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624 
Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672 
Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr 
225 230 235 240 

GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768 
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816 
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864 
Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile 
275 280 285 
CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912 
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe 
290 295 300 

ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr 
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly 
340 345 350 

AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152 
Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 1200 
Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr 
385 390 395 400 

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 1248 
Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296 
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys 
420 425 430 

AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 1392 
Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440 
Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 1488 
Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr 
485 490 495 
CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536 
His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr 
500 505 510 



CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584 
Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile 
515 520 525 

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632 
Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680 
Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728 
Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776 
Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr 
580 585 590 

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824 
Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe 
595 600 605 

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872 
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920 
Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr 
625 630 635 640 

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956 
Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln 
100 105 110 

Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn Pro 
145 150 155 160 

His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His 
165 170 175 

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr 
225 230 235 240 

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile 
275 280 285 

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe 
290 295 300 

Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr 
305 310 315 320 

Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly 
340 345 350 

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr 
385 390 395 400 

Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys 
420 425 430 

Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr 
485 490 495 

His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr 
500 505 510 

Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile 
515 520 525 

Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr 
580 585 590 

Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe 
595 600 605 

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr 
625 630 635 640 

Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 


(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1953 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 
AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 336 
Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln 
100 105 110 

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 384 
Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 432 
Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

AAT GCG TTA AAT TCC TGG AAG AAA TTT CAC CAT TCT CGT CGT TCT AAA 480 
Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser Lys 
145 150 155 160 

AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528 
Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His 
165 170 175 

TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 576 
Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624 
Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672 
Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr 
225 230 235 240 

GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768 
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816 
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864 
Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile 
275 280 285 

CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912 
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe 
290 295 300 
ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr 
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly 
340 345 350 

AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152 
Ser lie Gly Ser Ser Lys Thr lie Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 12 00 

Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val Tyr 
385 390 395 400 

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 124 8 

Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296 
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin Lys 
420 425 430 

AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 13 92 

Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440 
Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 14 88 

Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp Thr 
485 490 495 

CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536 
His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys lie Thr 
500 505 510 



CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584
Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile
515 520 525

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632
Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu
530 535 540

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680
Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala
545 550 555 560

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728
Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn
565 570 575

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776
Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr
580 585 590

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824
Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe
595 600 605

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn
610 615 620

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920
Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr
625 630 635 640

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956
Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys
115 120 125

Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val
130 135 140

Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser Lys
145 150 155 160

Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His
165 170 175

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu
180 185 190

Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu
195 200 205

Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp
210 215 220

Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr
225 230 235 240

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly
245 250 255

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met
260 265 270

Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile
275 280 285

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe
290 295 300

Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr
305 310 315 320

Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr
325 330 335

Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly
340 345 350

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro
355 360 365

Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys
370 375 380

Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400

Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val
405 410 415

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys
420 425 430

Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly His
435 440 445

Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp
450 455 460

Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys
465 470 475 480

Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr
485 490 495

His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr
500 505 510

Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile
515 520 525

Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu
530 535 540

Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala
545 550 555 560

Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn
565 570 575

Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr
580 585 590

Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe
595 600 605

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn
610 615 620

Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile Tyr
625 630 635 640

Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ACC CTT AAT ACA CTA CAG AAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620



Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TCT ACG GAT CCA ATT TTT GCC GTT AAT ACT CTG TGG GAA TAC GGA CCA 960
Ser Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GCA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Ala Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560



GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Ser Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Ala Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 63:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 124 8 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 

405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 13 92 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 

485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 16 8 0 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 

565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TTA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1479

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



AGT AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA 48
Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
1 5 10 15

AGT CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA 96
Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu
20 25 30

GTG CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG 144
Val Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu
35 40 45

CTA TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA 192
Leu Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser
50 55 60

GAA GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA 240
Glu Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln
65 70 75 80

TAC ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA 288
Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu
85 90 95

AGA GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA 336
Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
100 105 110

GAA ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT 384
Glu Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr
115 120 125

GAT ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC 432
Asp Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp
130 135 140

ATT TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA 480
Ile Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly
145 150 155 160

CCA ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT 528
Pro Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe
165 170 175

GAT TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC 576
Asp Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr
180 185 190

TTT GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT 624
Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr
195 200 205

AGA CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA 672
Arg Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly
210 215 220

GAT AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA 720
Asp Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys
225 230 235 240

GTT TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT 768
Val Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly
245 250 255

AAG GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT 816
Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp
260 265 270

CAA AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT 864
Gln Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn
275 280 285

GGC CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA 912
Gly His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr
290 295 300

ACA GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG 960
Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala
305 310 315 320

GAA TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT 1008
Glu Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr
325 330 335

TGG ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG 1056
Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys
340 345 350

ATT ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT 1104
Ile Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala
355 360 365

TCC ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA 1152
Ser Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu
370 375 380

AAA GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA 1200
Lys Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser
385 390 395 400

GCA GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC 1248
Ala Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr
405 410 415

ACT AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC 1296
Thr Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val
420 425 430

ATC TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA 1344
Ile Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln
435 440 445

ACA TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT 1392
Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp
450 455 460

AAG AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA 1440
Lys Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys
465 470 475 480

ATC TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1482
Ile Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
485 490



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 493 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
1 5 10 15

Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu
20 25 30

Val Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu
35 40 45

Leu Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser
50 55 60

Glu Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln
65 70 75 80

Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu
85 90 95

Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
100 105 110 
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Glu Met Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr 
115 120 125 



Asp lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp 
130 135 140 

lie Phe Thr Asp Pro lie Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly 
145 150 155 160 

Pro Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe 
165 170 175 

Asp Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr 
180 185 190 

Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr 
195 200 205 

Arg Pro Ser lie Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly 
210 215 220 

Asp Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys 
225 230 235 240 

Val Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly 
245 250 255 

Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp 
260 265 270 

Gin Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn 
275 280 285 

Gly His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr 
290 295 300 

Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala 
305 310 315 320 

Glu Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr 
325 330 335 

Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr lie Asp Ala Glu Lys 
340 345 350 

lie Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala 
355 360 365 

Ser lie lie Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu 
370 375 380 



Lys Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser
385 390 395 400

Ala Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr
405 410 415

Thr Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val
420 425 430

Ile Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln
435 440 445

Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp
450 455 460

Lys Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys
465 470 475 480

Ile Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
485 490



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
AGACAACTCT ACAGTAAAAG ATG 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
GGTAATTGGT CAATAGAATC 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 21..23

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
CAGAAGATGT TGCTGAATTC NNNCATAGAC AATTAAAAC 
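
As an illustrative aside that is not part of the original listing, the fully degenerate triplet at positions 21..23 of SEQ ID NO: 73 can be expanded into its concrete sequences. The sketch below assumes each N stands for an equal mixture of A, T, G and C, as the feature note states; the function name `expand_degenerate` is hypothetical.

```python
from itertools import product

# SEQ ID NO: 73 as listed above, with the display spaces removed.
OLIGO_73 = "".join("CAGAAGATGT TGCTGAATTC NNNCATAGAC AATTAAAAC".split())


def expand_degenerate(oligo, alphabet="ATGC"):
    """Return every concrete sequence obtained by replacing each N with one base."""
    n_positions = [i for i, base in enumerate(oligo) if base == "N"]
    variants = []
    for combo in product(alphabet, repeat=len(n_positions)):
        seq = list(oligo)
        for pos, base in zip(n_positions, combo):
            seq[pos] = base
        variants.append("".join(seq))
    return variants


variants_73 = expand_degenerate(OLIGO_73)
# 1-based positions of the degenerate bases, for comparison with the LOCATION field.
n_sites_73 = [i + 1 for i, base in enumerate(OLIGO_73) if base == "N"]
print(len(OLIGO_73), n_sites_73, len(variants_73))
```

Three independent four-way positions give 4^3 = 64 possible triplets, which is why an NNN site can in principle encode every codon.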



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 19..21

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GATGTTGCTG AATTCTATNN NAGACAATTA AAAC 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 17

(D) OTHER INFORMATION: /note= "N = A, T, C (16% each); G (52%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 18

(D) OTHER INFORMATION: /note= "N = T, G, C (10% each); A (70%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 19

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

CCCATTTTAT GATATTNNNT TATACTCAAA AGG



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 24

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(25, 27, 28, 30, 34, 36, 39, 43)

(D) OTHER INFORMATION: /note= "N = A, T, G (6% each); C (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(31, 33, 35, 37, 42, 44)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 40

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(26, 29, 32, 38, 41)

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
AGCTATGCTG GTCTCGGAAG AAANNNNNNN NNNNNNNNNN NNNNAAAAGA AGCCAAGATC 
GAAT 
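
As an illustrative aside that is not part of the original listing, the doping frequencies given for SEQ ID NO: 76 can be turned into simple expectations for one synthesized oligonucleotide. The sketch below assumes that every N position is drawn independently with the stated frequencies and that the 82% base is the one being "kept"; both points are assumptions made for illustration, and the variable names are hypothetical.

```python
# N positions of SEQ ID NO: 76 grouped by the feature notes above.
DOMINANT_82 = (
    [24]                                  # A at 82%
    + [25, 27, 28, 30, 34, 36, 39, 43]    # C at 82%
    + [31, 33, 35, 37, 42, 44]            # T at 82%
    + [40]                                # G at 82%
)
UNBIASED_25 = [26, 29, 32, 38, 41]        # A, T, G, C at 25% each

# Every N in the listed sequence (positions 24 to 44) should be described exactly once.
all_n_positions = sorted(DOMINANT_82 + UNBIASED_25)

# Expected number of biased positions that do NOT carry their 82% base.
expected_off_target = len(DOMINANT_82) * (1 - 0.82)

# Probability that all biased positions carry their 82% base simultaneously.
p_all_dominant = 0.82 ** len(DOMINANT_82)

print(len(DOMINANT_82), round(expected_off_target, 2), round(p_all_dominant, 4))
```

Under these assumptions, most oligonucleotides in the mixture differ from the dominant sequence at a few of the 16 biased positions, while only a small fraction match it at all of them.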



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GGTCACCTAG GTCTCTCTTC CAGGAATTTA ACGCATTAAC 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(22, 27, 29, 30, 37, 42)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(23, 26, 28, 31, 38, 40, 43, 44)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(24, 39)

(D) OTHER INFORMATION: /note= "N = A, T, G (1% each); C (97%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(25, 32, 33, 41, 46, 47, 48)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 34

(D) OTHER INFORMATION: /note= "N = A, T, G (15% each); C (55%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 45

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 35..36

(D) OTHER INFORMATION: /note= "N = A, G, C (15% each); T (55%)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
AGCTATGCTG GTCTCCCATT TNNNNNNNNN NNNNNNNNNN NNNNNNNNGT TAAAACAGAA 
CTAAC 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
ATCCAGTGGG GTCTCAAATG GGAAAAGTAC AATTAG 



(2) INFORMATION FOR SEQ ID NO:80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(23, 27, 31, 36, 44)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(24, 25, 26, 33, 35, 38)

(D) OTHER INFORMATION: /note= "N = A, T, G (6% each); C (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(28, 34, 37)

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(29, 30, 32, 39, 42, 45)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(40, 43)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 41

(D) OTHER INFORMATION: /note= "N = A, C (8% each); T (1%); G (83%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 46

(D) OTHER INFORMATION: /note= "N = A, T, G (1% each); C (97%)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
CATTTTTACG GATCCAATTT TTNNNNNNNN NNNNNNNNNN NNNNNNGGAC CAACTTTTTT 
GAG 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(28, 31, 32, 33, 42)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(29, 38, 39, 41)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 30

(D) OTHER INFORMATION: /note= "N = A, T, G (1% each); C (97%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(34, 35, 40)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 36

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 37

(D) OTHER INFORMATION: /note= "N = A (82%); T (2%); G, C (8% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GAATTTCATA CGCGTCTTCA ACCTGGTNNN NNNNNNNNNN NNTCTTTCAA TTATTGGTCT 60

GG 62



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(41, 49, 52)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 42..43

(D) OTHER INFORMATION: /note= "N = A (0%); T, C (9% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 44..45

(D) OTHER INFORMATION: /note= "N = A, T, G (6% each); C (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 46

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(47, 48, 53, 54)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(50, 51, 55)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 



AAAAGTTTAT CGAACTATAG CTAATACAGA CGTAGCGGCT NNNNNNNNNN NNNNNGTATA 



TTTAGGTGTT ACG 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 
GGAGTTCCAT TTGCTGGGGC 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 
ATCTCCATAA AATGGGG



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
GCGAAGTAAA AGAAGCCAAG GTCGAATAAG GG 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
CCTTTAAGTT TGCGAAATCC ACACAGCCAA GGTCGAATAA GGG 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
CCCATTTTAT GATGTTCGGT TATACCCAAA AGGGG 



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
GGCCAAGTGA AGACCCATGG AAGGC



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
GCAGTTTCCG GATTCGAAGT GC 22 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
CCGCTACGTC TGTATTA 17 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
ATAATGGAAG CACCTGA 17 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(22, 26, 29)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(23, 33, 36)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(24, 27, 28, 32, 35, 37, 38)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(25, 30, 31, 34)

(D) OTHER INFORMATION: /note= "N = A, T, G (6% each); C (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 39

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
AGCTATGCTG GTCTCTTCTT ANNNNNNNNN NNNNNNNNNA CAATTCCATT TTTTACTTGG 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
ATCCAGTTGG GTCTCTAAGA AACAAACCGC GTAATTAAGC 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94: 
CCTCAAGGGT TATAACATCC



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 55 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(19, 22, 23, 31)

(D) OTHER INFORMATION: /note= "N = A, T, C (6% each); G (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(20, 26, 27, 29, 30, 35)

(D) OTHER INFORMATION: /note= "N = T, G, C (6% each); A (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(21, 32, 34)

(D) OTHER INFORMATION: /note= "N = A, G, C (6% each); T (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of(24, 33)

(D) OTHER INFORMATION: /note= "N = A, T, G, C (25% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 25

(D) OTHER INFORMATION: /note= "N = A, G (8% each); T (2%); C (82%)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 28

(D) OTHER INFORMATION: /note= "N = A (82%); T (2%); G, C (8% each)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 36

(D) OTHER INFORMATION: /note= "N = A, G, C (1% each); T (97%)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 



GTACAAAAGC TAAGCTTTNN NNNNNNNNNN NNNNNNCGAA CTATAGCTAA TACAG



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Ser Lys Arg Ser Gln Asp Arg
1 5 



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
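
As an illustrative aside that is not part of the original listing, the pairing of codons and residues in SEQ ID NO: 97 can be spot-checked by translating with the standard genetic code. The sketch below rebuilds the standard codon table from its conventional TCAG ordering; the function name `translate` is hypothetical, and only the first and last codon rows of the listing are checked here.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces allowed) codon by codon; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First and last codon rows of SEQ ID NO: 97 as listed above.
FIRST_ROW = "ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT"
LAST_ROW = "TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA"

first_peptide = translate(FIRST_ROW)
last_peptide = translate(LAST_ROW)

# The CDS feature spans bases 1..1956, so it should encode 1956 / 3 residues.
cds_residues = 1956 // 3

print(first_peptide, last_peptide, cds_residues)
```

The final codon TAA at bases 1957 to 1959 is a stop codon, which is consistent with the CDS feature ending at base 1956 and with the 652-residue length given for SEQ ID NO: 98.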



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2000 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99: 

CCATCCATGG CAAACCCTAA CAATCGTTCC GAACACGACA CCATCAAGGT TACTCCAAAC 60

TCTGAGTTGC AAACTAATCA CAACCAGTAC CCATTGGCTG ACAATCCTAA CAGTACTCTT 120

GAGGAACTTA ACTACAAGGA GTTTCTCCGG ATGACCGAAG ATAGCTCCAC TGAGGTTCTC 180

GATAACTCTA CAGTGAAGGA CGCTGTTGGA ACTGGCATTA GCGTTGTGGG ACAGATTCTT 240

GGAGTGGTTG GTGTTCCATT CGCTGGAGCT TTGACCAGCT TCTACCAGTC CTTTCTCAAC 300 

ACCATCTGGC CTTCAGATGC TGATCCCTGG AAGGCTTTCA TGGCCCAAGT GGAAGTCTTG 360 

ATCGATAAGA AGATCGAAGA GTATGCCAAG TCTAAAGCCT TGGCTGAGTT GCAAGGTTTG 420 

CAGAACAACT TCGAGGATTA CGTCAACGCA CTCAACAGCT GGAAGAAAAC TCCCTTGAGT 480 

CTCAGGTCTA AGCGTTCCCA GGACCGTATT CGTGAACTTT TCAGCCAAGC CGAATCCCAC 540

TTCAGAAACT CCATGCCTAG CTTTGCCGTT TCTAAGTTCG AGGTGCTCTT CTTGCCAACA 600 

TACGCACAAG CTGCCAACAC TCATCTCTTG CTTCTCAAAG ACGCTCAGGT GTTTGGTGAG 660

GAATGGGGTT ACTCCAGTGA AGATGTTGCC GAGTTCTACC GTAGGCAGCT CAAGTTGACT 720 

CAACAGTACA CAGACCACTG CGTCAACTGG TACAACGTTG GGCTCAATGG TCTTAGAGGA 780

TCTACCTACG ACGCATGGGT GAAGTTCAAC AGGTTTCGTA GAGAGATGAC CTTGACTGTG 840

CTCGATCTTA TCGTTCTCTT TCCATTCTAC GACATTCGTC TTTACTCCAA AGGCGTTAAG 900 

ACAGAGCTGA CCAGAGACAT CTTCACCGAT CCCATCTTCC TACTTACGAC CCTGCAGAAA 960

TACGGTCCAA CTTTTCTCTC CATTGAGAAC AGCATCAGGA AGCCTCACCT CTTCGACTAT 1020

CTGCAAGGCA TTGAGTTTCA CACCAGGTTG CAACCTGGTT ACTTCGGTAA GGATTCCTTC 1080

AACTACTGGA GCGGAAACTA CGTTGAAACC AGACCATCCA TCGGATCTAG CAAGACCATC 1140

ACTTCTCCAT TCTACGGTGA CAAGAGCACT GAGCCAGTGC AGAAGTTGAG CTTCGATGGG 1200 

CAGAAGGTGT ATAGAACCAT CGCCAATACC GATGTTGCAG CTTGGCCTAA TGGCAAGGTC 1260 

TACCTTGGAG TTACTAAAGT GGACTTCTCC CAATACGACG ATCAGAAGAA CGAGACATCT 1320

ACTCAAACCT ACGATAGTAA GAGGAACAAT GGCCATGTTT CCGCACAAGA CTCCATTGAC 1380

CAACTTCCAC CTGAAACCAC TGATGAACCA TTGGAGAAGG CTTACAGTCA CCAACTTAAC 1440

TACGCCGAAT GCTTTCTCAT GCAAGACAGG CGTGGCACCA TTCCGTTCTT TACATGGACT 1500 

CACAGGTCTG TCGACTTCTT TAACACTATC GACGCTGAGA AGATTACCCA ACTTCCCGTG 1560 

GTCAAGGCTT ATGCCTTGTC CAGCGGAGCT TCCATCATTG AAGGTCCAGG CTTCACCGGT 1620

GGCAACTTGC TCTTCCTTAA GGAGTCCAGC AACTCCATCG CCAAGTTCAA AGTGACACTT 1680 

AACTCAGCAG CCTTGCTCCA ACGTTACAGG GTTCGTATCA GATACGCAAG CACTACCAAT 1740

CTTCGCCTCT TTGTCCAGAA CAGCAACAAT GATTTCCTTG TCATCTACAT CAACAAGACT 1800 

ATGAACAAAG ACGATGACCT CACCTACCAA ACATTCGATC TTGCCACTAC CAATAGTAAC 1860 

ATGGGATTCT CTGGTGACAA GAACGAGCTG ATCATAGGTG CTGAGAGCTT TGTCTCTAAT 1920 

GAGAAGATTT ACATAGACAA GATCGAGTTC ATTCCAGTTC AACTCTAATA GATCCCCCGG 1980 

GCTGCAGGAA TTCGATATCA 2000 



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 653 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Met Ala Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr
1 5 10 15

Pro Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp
20 25 30

Asn Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg
35 40 45

Met Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys
50 55 60

Asp Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val
65 70 75 80

Val Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe
85 90 95

Leu Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met
100 105 110

Ala Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys
115 120 125

Ser Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp
130 135 140

Tyr Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg
145 150 155 160

Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
165 170 175

Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu
180 185 190

Val Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu
195 200 205

Leu Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser
210 215 220

Glu Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln
225 230 235 240

Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu
245 250 255

Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
260 265 270

Glu Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr
275 280 285

Asp Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp
290 295 300

Ile Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly
305 310 315 320

Pro Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe
325 330 335

Asp Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr
340 345 350

Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr
355 360 365

Arg Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly
370 375 380

Asp Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys
385 390 395 400

Val Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly
405 410 415

Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp
420 425 430

Gln Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn
435 440 445

Gly His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr
450 455 460

Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala
465 470 475 480

Glu Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr
485 490 495

Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys
500 505 510

Ile Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala
515 520 525

Ser Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu
530 535 540

Lys Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser
545 550 555 560

Ala Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr
565 570 575

Thr Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val
580 585 590

Ile Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln
595 600 605

Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp
610 615 620

Lys Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys
625 630 635 640

Ile Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

TGGAGCTCCA CCGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCTAGGCCT CCATATGAAC 60

CCTAACAATC GTTCCGAACA CGACACCATC AAGGTTACTC CAAACTCTGA GTTGCAAACT 120

AATCACAACC AGTACCCATT GGCTGACAAT CCTAACAGTA CTCTTGAGGA ACTTAACTAC 180 

AAGGAGTTTC TCCGGATGAC CGAAGATAGC TCCACTGAGG TTCTCGATAA CTCTACAGTG 240 

AAGGACGCTG TTGGAACTGG CATTAGCGTT GTGGGACAGA TTCTTGGAGT GGTTGGTGTT 3 00 

CCATTCGCTG GAGCTTTGAC CAGCTTCTAC CAGTCCTTTC TCAACACCAT CTGGCCTTCA 360 

GATGCTGATC CCTGGAAGGC TTTCATGGCC CAAGTGGAAG TCTTGATCGA TAAGAAGATC 420 

GAAGAGTATG CCAAGTCTAA AGCCTTGGCT GAGTTGCAAG GTTTGCAGAA CAACTTCGAG 480 

GATTACGTCA ACGCACTCAA CAGCTGGAAG AAAACTCCCT TGAGTCTCAG GTCTAAGCGT 54 0 

TCCCAGGACC GTATTCGTGA ACTTTTCAGC CAAGCCGAAT CCCACTTCAG AAACTCCATG 6 00 

CCTAGCTTTG CCGTTTCTAA GTTCGAGGTG CTCTTCTTGC CAACATACGC ACAAGCTGCC 660 

AACACTCATC TCTTGCTTCT CAAAGACGCT CAGGTGTTTG GTGAGGAATG GGGTTACTCC 720 

AGTGAAGATG TTGCCGAGTT CTACCATAGG CAGCTCAAGT TGACTCAACA GTACACAGAC 780 

CACTGCGTCA ACTGGTACAA CGTTGGGCTC AATGGTCTTA GAGGATCTAC CTACGACGCA 840 

TGGGTGAAGT TCAACAGGTT TCGTAGAGAG ATGACCTTGA CTGTGCTCGA TCTTATCGTT 900 

CTCTTTCCAT TCTACGACAT TCGTCTTTAC TCCAAAGGCG TTAAGACAGA GCTGACCAGA 96 0 

GACATCTTCA CCGATCCCAT CTTCTCACTT AACACCCTGC AGGAATACGG TCCAACTTTT 1020 

CTCTCCATTG AGAACAGCAT CAGGAAGCCT CACCTCTTCG ACTATCTGCA AGGCATTGAG 1080 

TTTCACACCA GGTTGCAACC TGGTTACTTC GGTAAGGATT CCTTCAACTA CTGGAGCGGA 1140 
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AACTACGTTG AAACCAGACC ATCCATCGGA TCTAGCAAGA CCATCACTTC TCCATTCTAC 12 00 

GGTGACAAGA GCACTGAGCC AGTGCAGAAG TTGAGCTTCG ATGGGCAGAA GGTGTATAGA 126 0 

ACCATCGCCA ATACCGATGT TGCAGCTTGG CCTAATGGCA AGGTCTACCT TGGAGTTACT 1320 

AAAGTGGACT TCTCCCAATA CGACGATCAG AAGAACGAGA CATCTACTCA AACCTACGAT 13 80 

AGTAAGAGGA ACAATGGCCA TGTTTCCGCA CAAGACTCCA TTGACCAACT TCCACCTGAA 144 0 

ACCACTGATG AAC C ATTGG A GAAGGCTTAC AGTCACCAAC TTAACTACGC CGAATGCTTT 1500 

CTCATGCAAG ACAGGCGTGG CACCATTCCG TTCTTTACAT GGACTCACAG GTCTGTCGAC 1560 

TTCTTTAACA CTATCGACGC TGAGAAGATT ACCCAACTTC CCGTGGTCAA GGCTTATGCC 1620 

TTGTCCAGCG GAGCTTCCAT CATTGAAGGT CCAGGCTTCA CCGGTGGCAA CTTGCTCTTC 1680 

CTTAAGGAGT CCAGCAACTC CATCGCCAAG TTCAAAGTGA CACTTAACTC AGCAGCCTTG 174 0 

CTCCAACGTT ACAGGGTTCG TATCAGATAC GCAAGCACTA CCAATCTTCG CCTCTTTGTC 18 00 

CAGAACAGCA ACAATGATTT CCTTGTCATC TACATCAACA AGACTATGAA CAAAGACGAT 1860 

GACCTCACCT ACAACACATT CGATCTTGCC ACTACCAATA GTAACATGGG ATTCTCTGGT 192 0 

GACAAGAACG AGCTGATCAT AGGTGCTGAG AGCTTTGTCT CTAATGAGAA GATTTACATA 1980 

GACAAGATCG AGTTCATTCC AGTTCAACTC TAATAGATCC CCCGGGCTGC AGGAATTCGA 204 0 

TATCAAGCTT 2 0 50 

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2280 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

TTAAAATTAA TTTTGTATAC TTTTCATTGT AATAATATGA TTTTAAAAAC GAAAAAGTGC 60

ATATACAACT TATCAGGAGG GGGGGGATGC ACAAAGAAGA AAAGAATAAG AAGTGAATGT 120

TTATAATGTT CAATAGTTTT ATGGGAAGGC ATTTTATCAG GTAGAAAGTT ATGTATTATG 180

ATAAGAATGG GAGGAAGAAA AATGAATCCA AACAATCGAA GTGAACATGA TACGATAAAG 240

GTTACACCTA ACAGTGAATT GCAAACTAAC CATAATCAAT ATCCTTTAGC TGACAATCCA 300

AATTCAACAC TAGAAGAATT AAATTATAAA GAATTTTTAA GAATGACTGA AGACAGTTCT 360

ACGGAAGTGC TAGACAACTC TACAGTAAAA GATGCAGTTG GGACAGGAAT TTCTGTTGTA 420

GGGCAGATTT TAGGTGTTGT AGGAGTTCCA TTTGCTGGGG CACTCACTTC ATTTTATCAA 480

TCATTTCTTA ACACTATATG GCCAAGTGAT GCTGACCCAT GGAAGGCTTT TATGGCACAA 540

GTTGAAGTAC TGATAGATAA GAAAATAGAG GAGTATGCTA AAAGTAAAGC TCTTGCAGAG 600

TTACAGGGTC TTCAAAATAA TTTCGAAGAT TATGTTAATG CGTTAAATTC CTGGAAGAAA 660

ACACCTTTAA GTTTGCGAAG TAAAAGAAGC CAAGATCGAA TAAGGGAACT TTTTTCTCAA 720

GCAGAAAGTC ATTTTCGTAA TTCCATGCCG TCATTTGCAG TTTCCAAATT CGAAGTGCTG 780

TTTCTACCAA CATATGCACA AGCTGCAAAT ACACATTTAT TGCTATTAAA AGATGCTCAA 840

GTTTTTGGAG AAGAATGGGG ATATTCTTCA GAAGATGTTG CTGAATTTTA TCATAGACAA 900

TTAAAACTTA CACAACAATA CACTGACCAT TGTGTTAATT GGTATAATGT TGGATTAAAT 960

GGTTTAAGAG GTTCAACTTA TGATGCATGG GTCAAATTTA ACCGTTTTCG CAGAGAAATG 1020

ACTTTAACTG TATTAGATCT AATTGTACTT TTCCCATTTT ATGATATTCG GTTATACTCA 1080

AAAGGGGTTA AAACAGAACT AACAAGAGAC ATTTTTACGG ATCCAATTTT TTCACTTAAT 1140

ACTCTTCAGG AGTATGGACC AACTTTTTTG AGTATAGAAA ACTCTATTCG AAAACCTCAT 1200

TTATTTGATT ATTTACAGGG GATTGAATTT CATACGCGTC TTCAACCTGG TTACTTTGGG 1260

AAAGATTCTT TCAATTATTG GTCTGGTAAT TATGTAGAAA CTAGACCTAG TATAGGATCT 1320

AGTAAGACAA TTACTTCCCC ATTTTATGGA GATAAATCTA CTGAACCTGT ACAAAAGCTA 1380

AGCTTTGATG GACAAAAAGT TTATCGAACT ATAGCTAATA CAGACGTAGC GGCTTGGCCG 1440

AATGGTAAGG TATATTTAGG TGTTACGAAA GTTGATTTTA GTCAATATGA TGATCAAAAA 1500

AATGAAACTA GTACACAAAC ATATGATTCA AAAAGAAACA ATGGCCATGT AAGTGCACAG 1560

GATTCTATTG ACCAATTACC GCCAGAAACA ACAGATGAAC CACTTGAAAA AGCATATAGT 1620

CATCAGCTTA ATTACGCGGA ATGTTTCTTA ATGCAGGACC GTCGTGGAAC AATTCCATTT 1680

TTTACTTGGA CACATAGAAG TGTAGACTTT TTTAATACAA TTGATGCTGA AAAGATTACT 1740

CAACTTCCAG TAGTGAAAGC ATATGCCTTG TCTTCAGGTG CTTCCATTAT TGAAGGTCCA 1800

GGATTCACAG GAGGAAATTT ACTATTCCTA AAAGAATCTA GTAATTCAAT TGCTAAATTT 1860

AAAGTTACAT TAAATTCAGC AGCCTTGTTA CAACGATATC GTGTAAGAAT ACGCTATGCT 1920

TCTACCACTA ACTTACGACT TTTTGTGCAA AATTCAAACA ATGATTTTCT TGTCATCTAC 1980

ATTAATAAAA CTATGAATAA AGATGATGAT TTAACATATC AAACATTTGA TCTCGCAACT 2040

ACTAATTCTA ATATGGGGTT CTCGGGTGAT AAGAATGAAC TTATAATAGG AGCAGAATCT 2100

TTCGTTTCTA ATGAAAAAAT CTATATAGAT AAGATAGAAT TTATCCCAGT ACAATTGTAA 2160

GGAGATTTTA AAATGTTGGG TGATGGTCAA AATGAAAGAA TAGGAAGGTG AATTTTGATG 2220

GTTAGGAAAG ATTCTTTTAA CAAAAGCAAC ATGGAAAAGT ATACAGTACA AATATTAACC 2280

(2) INFORMATION FOR SEQ ID NO:103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 
TAGGCCTCCA TCCATGGCAA ACCCTAACAA TC 32 

(2) INFORMATION FOR SEQ ID NO:104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
TCCCATCTTC CTACTTACGA CCCTGCAGAA ATACGGTCCA AC 42

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
GACCTCACCT ACCAAACATT CGATCTTG 28 
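
As a consistency check, the declared LENGTH of each short oligonucleotide entry can be compared with the number of bases actually listed. The following is only an illustrative sketch: the helper name `count_bases` is hypothetical, and the three strings are copied from SEQ ID NO: 103 to 105 with the stray OCR spaces in SEQ ID NO: 104 closed up.

```python
import re


def count_bases(listing_line: str) -> int:
    """Count A/C/G/T characters, ignoring spaces and the trailing position number."""
    return len(re.findall(r"[ACGT]", listing_line.upper()))


# (listing line, declared LENGTH) keyed by SEQ ID NO; values copied from the listing.
entries = {
    103: ("TAGGCCTCCA TCCATGGCAA ACCCTAACAA TC 32", 32),
    104: ("TCCCATCTTC CTACTTACGA CCCTGCAGAA ATACGGTCCA AC 42", 42),
    105: ("GACCTCACCT ACCAAACATT CGATCTTG 28", 28),
}

for seq_id, (line, declared) in entries.items():
    counted = count_bases(line)
    print(f"SEQ ID NO: {seq_id}: counted {counted}, declared {declared}, match={counted == declared}")
```

The same check is a quick way to spot lines where character recognition has dropped or duplicated a base.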
(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



CGAGTTCTAC CGTAGGCAGC TCAAG 25



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

ATGAATCCAA ACAATCGAAG TGAACATGAT ACGATAAAGG TTACACCTAA CAGTGAATTG 60 

CAAACTAACC ATAATCAATA TCCTTTAGCT GACAATCCAA ATTCAACACT AGAAGAATTA 120 

AATTATAAAG AATTTTTAAG AATGACTGAA GACAGTTCTA CGGAAGTGCT AGACAACTCT 180 

ACAGTAAAAG ATGCAGTTGG GACAGGAATT TCTGTTGTAG GGCAGATTTT AGGTGTTGTA 240

GGAGTTCCAT TTGCTGGGGC ACTCACTTCA TTTTATCAAT CATTTCTTAA CACTATATGG 300

CCAAGTGATG CTGACCCATG GAAGGCTTTT ATGGCACAAG TTGAAGTACT GATAGATAAG 360 

AAAATAGAGG AGTATGCTAA AAGTAAAGCT CTTGCAGAGT TACAGGGTCT TCAAAATAAT 420 

TTCGAAGATT ATGTTAATGC GTTAAATTCC TGGAAGAAAA CACCTTTAAG TTTGCGAAGT 480 

AAAAGAAGCC AAGGTCGAAT AAGGGAACTT TTTTCTCAAG CAGAAAGTCA TTTTCGTAAT 540 

TCCATGCCGT CATTTGCAGT TTCCAAATTC GAAGTGCTGT TTCTACCAAC ATATGCACAA 600 

GCTGCAAATA CACATTTATT GCTATTAAAA GATGCTCAAG TTTTTGGAGA AGAATGGGGA 660 

TATTCTTCAG AAGATGTTGC TGAATTCTAT CGTAGACAAT TAAAACTTAC ACAACAATAC 720 

ACTGACCATT GTGTTAATTG GTATAATGTT GGATTAAATG GTTTAAGAGG TTCAACTTAT 780 

GATGCATGGG TCAAATTTAA CCGTTTTCGC AGAGAAATGA CTTTAACTGT ATTAGATCTA 840 

ATTGTACTTT TCCCATTTTA TGATATTCGG TTATACTCAA AAGGGGTTAA AACAGAACTA 900 

ACAAGAGACA TTTTTACGGA TCCAATTTTT TTACTTACTA CGCTTCAGAA GTACGGACCA 960

ACTTTTTTGA GTATAGAAAA CTCTATTCGA AAACCTCATT TATTTGATTA TTTACAGGGG 1020

ATTGAATTTC ATACGCGTCT TCAACCTGGT TACTTTGGGA AAGATTCTTT CAATTATTGG 1080

TCTGGTAATT ATGTAGAAAC TAGACCTAGT ATAGGATCTA GTAAGACAAT TACTTCCCCA 1140

TTTTATGGAG ATAAATCTAC TGAACCTGTA CAAAAGCTAA GCTTTGATGG ACAAAAAGTT 1200 

TATCGAACTA TAGCTAATAC AGACGTAGCG GCTTGGCCGA ATGGTAAGGT ATATTTAGGT 1260 

GTTACGAAAG TTGATTTTAG TCAATATGAT GATCAAAAAA ATGAAACTAG TACACAAACA 1320 

TATGATTCAA AAAGAAACAA TGGCCATGTA AGTGCACAGG ATTCTATTGA CCAATTACCG 1380 

CCAGAAACAA CAGATGAACC ACTTGAAAAA GCATATAGTC ATCAGCTTAA TTACGCGGAA 1440 

TGTTTCTTAA TGCAGGACCG TCGTGGAACA ATTCCATTTT TTACTTGGAC ACATAGAAGT 1500 

GTAGACTTTT TTAATACAAT TGATGCTGAA AAGATTACTC AACTTCCAGT AGTGAAAGCA 1560 

TATGCCTTGT CTTCAGGTGC TTCCATTATT GAAGGTCCAG GATTCACAGG AGGAAATTTA 1620 

CTATTCCTAA AAGAATCTAG TAATTCAATT GCTAAATTTA AAGTTACATT AAATTCAGCA 1680 

GCCTTGTTAC AACGATATCG TGTAAGAATA CGCTATGCTT CTACCACTAA CTTACGACTT 1740 

TTTGTGCAAA ATTCAAACAA TGATTTTCTT GTCATCTACA TTAATAAAAC TATGAATAAA 1800 

GATGATGATT TAACATATCA AACATTTGAT CTCGCAACTA CTAATTCTAA TATGGGGTTC 1860 

TCGGGTGATA AGAATGAACT TATAATAGGA GCAGAATCTT TCGTTTCTAA TGAAAAAATC 1920 

TATATAGATA AGATAGAATT TATCCCAGTA CAATTGTAA 1959 
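
SEQ ID NO: 107 is 1959 bases long, that is 653 codons, which is consistent with a 652-residue protein plus one stop codon. The sketch below translates the first 60 bases and the last 9 bases of SEQ ID NO: 107 with the standard genetic code. It assumes, without the listing saying so explicitly, that SEQ ID NO: 108 is the corresponding translation; the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored, '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First line and final nine bases of SEQ ID NO: 107, copied from the listing.
first_60 = "ATGAATCCAA ACAATCGAAG TGAACATGAT ACGATAAAGG TTACACCTAA CAGTGAATTG"
last_9 = "CAATTGTAA"

print(translate(first_60))  # one-letter code for the first 20 residues
print(translate(last_9))    # final two residues followed by the stop codon
print(1959 // 3 - 1)        # residues expected once the stop codon is excluded
```

The first 20 residues come out as Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro Asn Ser Glu Leu and the last two as Gln Leu, matching the start and end of SEQ ID NO: 108.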

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 649 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Ala Thr Glu
1 5 10 15

Asn Asn Glu Val Ser Asn Asn His Ala Gln Tyr Pro Leu Ala Asp Thr
20 25 30

Pro Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Arg Thr Thr
35 40 45

Asp Asn Asn Val Glu Ala Leu Asp Ser Ser Thr Thr Lys Asp Ala Ile
50 55 60

Gln Lys Gly Ile Ser Ile Ile Gly Asp Leu Leu Gly Val Val Gly Phe
65 70 75 80

Pro Tyr Gly Gly Ala Leu Val Ser Phe Tyr Thr Asn Leu Leu Asn Thr
85 90 95

Ile Trp Pro Gly Glu Asp Pro Leu Lys Ala Phe Met Gln Gln Val Glu
100 105 110

Ala Leu Ile Asp Gln Lys Ile Ala Asp Tyr Ala Lys Asp Lys Ala Thr
115 120 125

Ala Glu Leu Gln Gly Leu Lys Asn Val Phe Lys Asp Tyr Val Ser Ala
130 135 140

Leu Asp Ser Trp Asp Lys Thr Pro Leu Thr Leu Arg Asp Gly Arg Ser
145 150 155 160

Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His Phe Arg
165 170 175

Arg Ser Met Pro Ser Phe Ala Val Ser Gly Tyr Glu Val Leu Phe Leu
180 185 190

Cys Phe Leu Leu Gln Gly Gly Arg Gly Ile Ile Pro Val Phe Thr Trp
485 490 495

Thr His Lys Ser Val Asp Phe Tyr Asn Thr Leu Asp Ser Glu Lys Ile
500 505 510

Thr Gln Ile Pro Phe Val Lys Ala Phe Ile Leu Val Asn Ser Thr Ser
515 520 525

Val Val Ala Gly Pro Gly Phe Thr Gly Gly Asp Ile Ile Lys Cys Thr
530 535 540

Asn Gly Ser Gly Leu Thr Leu Tyr Val Thr Pro Ala Pro Asp Leu Thr
545 550 555 560

Tyr Ser Lys Thr Tyr Lys Ile Arg Ile Arg Tyr Ala Ser Thr Ser Gln
565 570 575

Val Arg Phe Gly Ile Asp Leu Gly Ser Tyr Thr His Ser Ile Ser Tyr
580 585 590

Phe Asp Lys Thr Met Asp Lys Gly Asn Thr Leu Thr Tyr Asn Ser Phe
595 600 605

Asn Leu Ser Ser Val Ser Arg Pro Ile Glu Ile Ser Gly Gly Asn Lys
610 615 620

Ile Gly Val Ser Val Gly Gly Ile Gly Ser Gly Asp Glu Val Tyr Ile
625 630 635 640

Asp Lys Ile Glu Phe Ile Pro Met Asp
645



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Pro Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asp Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Ser
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Ile Tyr Phe Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Gly Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Ile Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Ile Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Thr
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495



Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650
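
Several of the 652-residue entries differ from one another at only a few positions, and these are easier to see when two entries are compared residue by residue. The sketch below does this for residues 305 to 320 of SEQ ID NO: 111 and SEQ ID NO: 108. Treating SEQ ID NO: 111 as the reference is an assumption made only for this illustration, and the helper names `to_one_letter` and `substitutions` are hypothetical.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert a space-separated run of three-letter codes to one-letter code."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())


def substitutions(reference: str, variant: str, start: int) -> list[str]:
    """List differences as e.g. 'S311L'; `start` is the number of the first residue."""
    ref, var = to_one_letter(reference), to_one_letter(variant)
    if len(ref) != len(var):
        raise ValueError("fragments must have the same length")
    return [f"{a}{start + i}{b}" for i, (a, b) in enumerate(zip(ref, var)) if a != b]


# Residues 305-320 as listed (OCR misreadings of Ile and Gln corrected).
seq_111 = "Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro"
seq_108 = "Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro"

print(substitutions(seq_111, seq_108, start=305))
```

For this window the comparison reports three differences, at positions 311, 313 and 317.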



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 659 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112: 

Met Ile Arg Met Gly Gly Arg Lys Met Asn Pro Asn Asn Arg Ser Glu
1 5 10 15

Tyr Asp Thr Ile Lys Val Thr Pro Asn Ser Glu Leu Pro Thr Asn His
20 25 30

Asn Gln Tyr Pro Leu Ala Asp Asn Pro Asn Ser Thr Leu Glu Glu Leu
35 40 45

Asn Tyr Lys Glu Phe Leu Arg Met Thr Ala Asp Asn Ser Thr Glu Val
50 55 60

Leu Asp Ser Ser Thr Val Lys Asp Ala Val Gly Thr Gly Ile Ser Val
65 70 75 80

Val Gly Gln Ile Leu Gly Val Val Gly Val Pro Phe Ala Gly Ala Leu
85 90 95

Thr Ser Phe Tyr Gln Ser Phe Leu Asn Ala Ile Trp Pro Ser Asp Ala
100 105 110

Asp Pro Trp Lys Ala Phe Met Ala Gln Val Glu Val Leu Ile Asp Lys
115 120 125

Lys Ile Glu Glu Tyr Ala Lys Ser Lys Ala Leu Ala Glu Leu Gln Gly
130 135 140

Leu Gln Asn Asn Phe Glu Asp Tyr Val Asn Ala Leu Asp Ser Trp Lys
145 150 155 160

Lys Ala Pro Val Asn Leu Arg Ser Arg Arg Ser Gln Asp Arg Ile Arg
165 170 175

Glu Leu Phe Ser Gln Ala Glu Ser His Phe Arg Asn Ser Met Pro Ser
180 185 190

Phe Ala Val Ser Lys Phe Glu Val Leu Phe Leu Pro Thr Tyr Ala Gln
195 200 205

Ala Ala Asn Thr His Leu Leu Leu Leu Lys Asp Ala Gln Val Phe Gly
210 215 220

Glu Glu Trp Gly Tyr Ser Ser Glu Asp Ile Ala Glu Phe Tyr Gln Arg
225 230 235 240

Gln Leu Lys Leu Thr Gln Gln Tyr Thr Asp His Cys Val Asn Trp Tyr
245 250 255

Asn Val Gly Leu Asn Ser Leu Arg Gly Ser Thr Tyr Asp Ala Trp Val
260 265 270

Lys Phe Asn Arg Phe Arg Arg Glu Met Thr Leu Thr Val Leu Asp Leu
275 280 285

Ile Val Leu Phe Pro Phe Tyr Asp Val Arg Leu Tyr Ser Lys Gly Val
290 295 300

Lys Thr Glu Leu Thr Arg Asp Ile Phe Thr Asp Pro Ile Phe Thr Leu
305 310 315 320

Asn Ala Leu Gln Glu Tyr Gly Pro Thr Phe Ser Ser Ile Glu Asn Ser
325 330 335

Ile Arg Lys Pro His Leu Phe Asp Tyr Leu Arg Gly Ile Glu Phe His
340 345 350

Thr Arg Leu Arg Pro Gly Tyr Ser Gly Lys Asp Ser Phe Asn Tyr Trp
355 360 365

Ser Gly Asn Tyr Val Glu Thr Arg Pro Ser Ile Gly Ser Asn Asp Thr
370 375 380

Ile Thr Ser Pro Phe Tyr Gly Asp Lys Ser Ile Glu Pro Ile Gln Lys
385 390 395 400

Leu Ser Phe Asp Gly Gln Lys Val Tyr Arg Thr Ile Ala Asn Thr Asp
405 410 415

Ile Ala Ala Phe Pro Asp Gly Lys Ile Tyr Phe Gly Val Thr Lys Val
420 425 430

Asp Phe Ser Gln Tyr Asp Asp Gln Lys Asn Glu Thr Ser Thr Gln Thr
435 440 445

Tyr Asp Ser Lys Arg Tyr Asn Gly Tyr Leu Gly Ala Gln Asp Ser Ile
450 455 460

Asp Gln Leu Pro Pro Glu Thr Thr Asp Glu Pro Leu Glu Lys Ala Tyr
465 470 475 480

Ser His Gln Leu Asn Tyr Ala Glu Cys Phe Leu Met Gln Asp Arg Arg
485 490 495

Gly Thr Ile Pro Phe Phe Thr Trp Thr His Arg Ser Val Asp Phe Phe
500 505 510

Asn Thr Ile Asp Ala Glu Lys Ile Thr Gln Leu Pro Val Val Lys Ala
515 520 525

Tyr Ala Leu Ser Ser Gly Ala Ser Ile Ile Glu Gly Pro Gly Phe Thr
530 535 540

Gly Gly Asn Leu Leu Phe Leu Lys Glu Ser Ser Asn Ser Ile Ala Lys
545 550 555 560

Phe Lys Val Thr Leu Asn Ser Ala Ala Leu Leu Gln Arg Tyr Arg Val
565 570 575

Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Arg Leu Phe Val Gln Asn
580 585 590

Ser Asn Asn Asp Phe Leu Val Ile Tyr Ile Asn Lys Thr Met Asn Ile
595 600 605

Asp Gly Asp Leu Thr Tyr Gln Thr Phe Asp Phe Ala Thr Ser Asn Ser
610 615 620

Asn Met Gly Phe Ser Gly Asp Thr Asn Asp Phe Ile Ile Gly Ala Glu
625 630 635 640

Ser Phe Val Ser Asn Glu Lys Ile Tyr Ile Asp Lys Ile Glu Phe Ile
645 650 655

Pro Val Gln



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Met Ile Arg Lys Gly Gly Arg Lys Met Asn Pro Asn Asn Arg Ser Glu
1 5 10 15

His Asp Thr Ile Lys Thr Thr Glu Asn Asn Glu Val Pro Thr Asn His
20 25 30

Val Gln Tyr Pro Leu Ala Glu Thr Pro Asn Pro Thr Leu Glu Asp Leu
35 40 45

Asn Tyr Lys Glu Phe Leu Arg Met Thr Ala Asp Asn Asn Thr Glu Ala
50 55 60

Leu Asp Ser Ser Thr Thr Lys Asp Val Ile Gln Lys Gly Ile Ser Val
65 70 75 80

Val Gly Asp Leu Leu Gly Val Val Gly Phe Pro Phe Gly Gly Ala Leu
85 90 95

Val Ser Phe Tyr Thr Asn Phe Leu Asn Thr Ile Trp Pro Ser Glu Asp
100 105 110

Pro Trp Lys Ala Phe Met Glu Gln Val Glu Ala Leu Met Asp Gln Lys
115 120 125

Ile Ala Asp Tyr Ala Lys Asn Lys Ala Leu Ala Glu Leu Gln Gly Leu
130 135 140

Gln Asn Asn Val Glu Asp Tyr Val Ser Ala Leu Ser Ser Trp Gln Lys
145 150 155 160

Asn Pro Val Ser Ser Arg Asn Pro His Ser Gln Gly Arg Ile Arg Glu
165 170 175






Leu Phe Ser Gln Ala Glu Ser His Phe Arg Asn Ser Met Pro Ser Phe
180 185 190

Ala Ile Ser Gly Tyr Glu Val Leu Phe Leu Thr Thr Tyr Ala Gln Ala
195 200 205

Ala Asn Thr His Leu Phe Leu Leu Lys Asp Ala Gln Ile Tyr Gly Glu
210 215 220

Glu Trp Gly Tyr Glu Lys Glu Asp Ile Ala Glu Phe Tyr Lys Arg Gln
225 230 235 240

Leu Lys Leu Thr Gln Glu Tyr Thr Asp His Cys Val Lys Trp Tyr Asn
245 250 255

Val Gly Leu Asp Lys Leu Arg Gly Ser Ser Tyr Glu Ser Trp Val Asn
260 265 270

Phe Asn Arg Tyr Arg Arg Glu Met Thr Leu Thr Val Leu Asp Leu Ile
275 280 285

Ala Leu Phe Pro Leu Tyr Asp Val Arg Leu Tyr Pro Lys Glu Val Lys
290 295 300

Thr Glu Leu Thr Arg Asp Val Leu Thr Asp Pro Ile Val Gly Val Asn
305 310 315 320

Asn Leu Arg Gly Tyr Gly Thr Thr Phe Ser Asn Ile Glu Asn Tyr Ile
325 330 335

Arg Lys Pro His Leu Phe Asp Tyr Leu His Arg Ile Gln Phe His Thr
340 345 350

Arg Phe Gln Pro Gly Tyr Tyr Gly Asn Asp Ser Phe Asn Tyr Trp Ser
355 360 365

Gly Asn Tyr Val Ser Thr Arg Pro Ser Ile Gly Ser Asn Asp Ile Ile
370 375 380

Thr Ser Pro Phe Tyr Gly Asn Lys Ser Ser Glu Pro Val Gln Asn Leu
385 390 395 400

Glu Phe Asn Gly Glu Lys Val Tyr Arg Ala Val Ala Asn Thr Asn Leu
405 410 415

Ala Val Trp Pro Ser Ala Val Tyr Ser Gly Val Thr Lys Val Glu Phe
420 425 430

Ser Gln Tyr Asn Asp Gln Thr Asp Glu Ala Ser Thr Gln Thr Tyr Asp
435 440 445

Ser Lys Arg Asn Val Gly Ala Val Ser Trp Asp Ser Ile Asp Gln Leu
450 455 460






Pro Pro Glu Thr Thr Asp Glu Pro Leu Glu Lys Gly Tyr Ser His Gln
465 470 475 480

Leu Asn Tyr Val Met Cys Phe Leu Met Gln Gly Ser Arg Gly Thr Ile
485 490 495

Pro Val Leu Thr Trp Thr His Lys Ser Val Asp Phe Phe Asn Met Ile
500 505 510

Asp Ser Lys Lys Ile Thr Gln Leu Pro Leu Val Lys Ala Tyr Lys Leu
515 520 525

Gln Ser Gly Ala Ser Val Val Ala Gly Pro Arg Phe Thr Gly Gly Asp
530 535 540

Ile Ile Gln Cys Thr Glu Asn Gly Ser Ala Ala Thr Ile Tyr Val Thr
545 550 555 560

Pro Asp Val Ser Tyr Ser Gln Lys Tyr Arg Ala Arg Ile His Tyr Ala
565 570 575

Ser Thr Ser Gln Ile Thr Phe Thr Leu Ser Leu Asp Gly Ala Pro Phe
580 585 590

Asn Gln Tyr Tyr Phe Asp Lys Thr Ile Asn Lys Gly Asp Thr Leu Thr
595 600 605

Tyr Asn Ser Phe Asn Leu Ala Ser Phe Ser Thr Pro Phe Glu Leu Ser
610 615 620

Gly Asn Asn Leu Gln Ile Gly Val Thr Gly Leu Ser Ala Gly Asp Lys
625 630 635 640

Val Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Asn
645 650
9.0 Appendix

9.1 Crystal Coordinates of Cry3Bb

THE ATOMIC MODEL INCLUDES RESIDUES 61 - 644 OF THE PROTEIN
AND 106 BOUND WATER MOLECULES. BULK SOLVENT CONTRIBUTION
TO THE STRUCTURE FACTOR WAS CALCULATED USING THE CCP4
PROGRAM SFALL AND INCLUDED IN THIS REFINEMENT.

CRYIIIA BELONGS TO THE "CRY" FAMILY OF DELTA-ENDOTOXINS,
WHICH ARE PORE-FORMING INSECTICIDAL PROTEIN TOXINS
CONTAINED IN THE CRYSTALLINE PARASPORAL INCLUSIONS OF
BACILLUS THURINGIENSIS. THE SUBCLASS III IS TOXIC
SPECIFICALLY TO COLEOPTERAN INSECTS, I.E., BEETLES. THEY
FUNCTION BY BINDING TO MIDGUT EPITHELIAL CELLS AND INDUCE
COLLOIDOSMOTIC LYSIS.

THE SEQUENCE OF CRYIIIA IN THE ATOMIC MODEL IS TAKEN FROM:
HOEFTE, H., SEURINCK, J., VAN HOUTVEN, A. AND VAECK, M.
(1987) NUCLEIC ACIDS RES. 15:7183. EMBL ACCESSION NUMBER
P07130, ENTRY NAME CR70_BACTT. RESIDUES 1-57 ARE
REMOVED IN THE MATURE TOXIN. RESIDUES 58 - 60 ARE
INVISIBLE IN THE CRYSTAL STRUCTURE.

THE STRUCTURE WAS REPORTED IN PAPER [1] ABOVE AFTER
PRELIMINARY REFINEMENT. THE COORDINATES BEING DEPOSITED
HERE RESULTED FROM FURTHER REFINEMENT.
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14 


.09 


1DLC1917 


ATOM   1794  C   ALA   281      39.244  50.047  47.686  1.00 16.05      1DLC1918
ATOM   1795  O   ALA   281      39.711  50.300  48.792  1.00 19.60      1DLC1919
ATOM   1796  CB  ALA   281      37.534  48.218  47.758  1.00 10.28      1DLC1920
ATOM   1797  N   LEU   282      39.968  50.109  46.568  1.00 17.34      1DLC1921
ATOM   1798  CA  LEU   282      41.404  50.412  46.566  1.00 17.85      1DLC1922
ATOM   1799  C   LEU   282      41.772  51.901  46.551  1.00 18.73      1DLC1923
ATOM   1800  O   LEU   282      42.853  52.282  47.002  1.00 20.35      1DLC1924
ATOM   1801  CB  LEU   282      42.088  49.692  45.396  1.00 13.56      1DLC1925
ATOM   1802  CG  LEU   282      41.818  48.181  45.369  1.00 15.77      1DLC1926
ATOM   1803  CD1 LEU   282      42.557  47.527  44.222  1.00 15.69      1DLC1927
ATOM   1804  CD2 LEU   282      42.219  47.554  46.693  1.00 11.02      1DLC1928
ATOM   1805  N   PHE   283      40.853  52.738  46.078  1.00 17.57      1DLC1929
ATOM   1806  CA  PHE   283      41.056  54.189  45.993  1.00 16.08      1DLC1930
ATOM   1807  C   PHE   283      41.766  54.859  47.190  1.00 17.36      1DLC1931
ATOM   1808  O   PHE   283      42.711  55.622  47.003  1.00 18.59      1DLC1932
ATOM   1809  CB  PHE   283      39.714  54.910  45.750  1.00 16.79      1DLC1933
ATOM   1810  CG  PHE   283      39.107  54.676  44.384  1.00 11.95      1DLC1934
ATOM   1811  CD1 PHE   283      39.792  53.973  43.397  1.00  9.94      1DLC1935
ATOM   1812  CD2 PHE   283      37.851  55.199  44.083  1.00 11.93      1DLC1936
ATOM   1813  CE1 PHE   283      39.231  53.797  42.123  1.00 15.02      1DLC1937
ATOM   1814  CE2 PHE   283      37.284  55.027  42.815  1.00 13.95      1DLC1938
ATOM   1815  CZ  PHE   283      37.978  54.325  41.832  1.00  8.45      1DLC1939
ATOM   1816  N   PRO   284      41.325  54.577  48.432  1.00 16.44      1DLC1940
ATOM   1817  CA  PRO   284      41.954  55.188  49.613  1.00 16.04      1DLC1941
ATOM   1818  C   PRO   284      43.435  54.855  49.775  1.00 18.64      1DLC1942
ATOM   1819  O   PRO   284      44.206  55.624  50.357  1.00 20.90      1DLC1943
ATOM   1820  CB  PRO   284      41.154  54.593  50.771  1.00 11.65      1DLC1944
ATOM   1821  CG  PRO   284      39.841  54.292  50.168  1.00 13.79      1DLC1945
ATOM   1822  CD  PRO   284      40.210  53.707  48.843  1.00 14.01      1DLC1946
ATOM   1823  N   LEU   285      43.825  53.697  49.263  1.00 19.68      1DLC1947
ATOM   1824  CA  LEU   285      45.198  53.244  49.366  1.00 17.84      1DLC1948
ATOM   1825  C   LEU   285      46.126  53.959  48.383  1.00 17.73      1DLC1949
ATOM   1826  O   LEU   285      47.334  53.743  48.391  1.00 22.07      1DLC1950
ATOM   1827  CB  LEU   285      45.248  51.719  49.220  1.00 19.70      1DLC1951
ATOM   1828  CG  LEU   285      44.269  50.970  50.149  1.00 20.17      1DLC1952
ATOM   1829  CD1 LEU   285      44.337  49.484  49.897  1.00 22.11      1DLC1953
ATOM   1830  CD2 LEU   285      44.558  51.263  51.609  1.00 17.27      1DLC1954
ATOM   1831  N   TYR   286      45.554  54.817  47.543  1.00 17.08      1DLC1955
ATOM   1832  CA  TYR   286      46.335  55.608  46.588  1.00 16.87      1DLC1956
ATOM   1833  C   TYR   286      46.878  56.856  47.292  1.00 17.31      1DLC1957
ATOM   1834  O   TYR   286      47.635  57.630  46.709  1.00 18.04      1DLC1958
ATOM   1835  CB  TYR   286      45.484  56.044  45.395  1.00 15.25
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ATOM 


1836 


CG 


TYR 


236 


44 


. 982 


54 


.917 


44 


.529 


1 


.00 


17 


. 75 


1DLC1960 


ATOM   1837  CD1 TYR   286      45.468  53.615  44.680  1.00 19.64      1DLC1961
ATOM   1838  CD2 TYR   286      44.008  55.149  43.559  1.00 19.56      1DLC1962
ATOM   1839  CE1 TYR   286      44.990  52.579  43.888  1.00 19.02      1DLC1963
ATOM   1840  CE2 TYR   286      43.526  54.121  42.763  1.00 17.69      1DLC1964
ATOM   1841  CZ  TYR   286      44.016  52.846  42.933  1.00 17.71      1DLC1965
ATOM   1842  OH  TYR   286      43.507  51.832  42.166  1.00 19.21      1DLC1966
ATOM   1843  N   ASP   287      46.449  57.080  48.531  1.00 15.82      1DLC1967
ATOM   1844  CA  ASP   287      46.942  58.221  49.279  1.00 15.74      1DLC1968
ATOM   1845  C   ASP   287      48.293  57.856  49.888  1.00 15.59      1DLC1969
ATOM   1846  O   ASP   287      48.376  57.359  51.013  1.00 15.25      1DLC1970
ATOM   1847  CB  ASP   287      45.958  58.648  50.365  1.00 14.24      1DLC1971
ATOM   1848  CG  ASP   287      46.307  60.007  50.970  1.00 19.57      1DLC1972
ATOM   1849  OD1 ASP   287      47.482  60.422  50.914  1.00 23.20      1DLC1973
ATOM   1850  OD2 ASP   287      45.405  60.672  51.519  1.00 24.37      1DLC1974
ATOM   1851  N   VAL   288      49.352  58.154  49.142  1.00 17.00      1DLC1975
ATOM   1852  CA  VAL   288      50.717  57.860  49.564  1.00 19.40      1DLC1976
ATOM   1853  C   VAL   288      51.236  58.650  50.761  1.00 20.76      1DLC1977
ATOM   1854  O   VAL   288      52.334  58.403  51.236  1.00 24.76      1DLC1978
ATOM   1855  CB  VAL   288      51.702  57.989  48.397  1.00 17.31      1DLC1979
ATOM   1856  CG1 VAL   288      51.291  57.054  47.285  1.00 18.35      1DLC1980
ATOM   1857  CG2 VAL   288      51.762  59.418  47.905  1.00 17.23      1DLC1981
ATOM   1858  N   ARG   289      50.447  59.598  51.248  1.00 23.06      1DLC1982
ATOM   1859  CA  ARG   289      50.846  60.391  52.410  1.00 22.35      1DLC1983
ATOM   1860  C   ARG   289      50.361  59.717  53.669  1.00 20.93      1DLC1984
ATOM   1861  O   ARG   289      51.051  59.715  54.684  1.00 21.28      1DLC1985
ATOM   1862  CB  ARG   289      50.283  61.800  52.311  1.00 28.20      1DLC1986
ATOM   1863  CG  ARG   289      50.680  62.438  51.025  1.00 31.27      1DLC1987
ATOM   1864  CD  ARG   289      50.447  63.905  51.004  1.00 41.78      1DLC1988
ATOM   1865  NE  ARG   289      51.389  64.454  50.046  1.00 51.47      1DLC1989
ATOM   1866  CZ  ARG   289      52.620  64.833  50.361  1.00 52.65      1DLC1990
ATOM   1867  NH1 ARG   289      53.034  64.743  51.621  1.00 52.33      1DLC1991
ATOM   1868  NH2 ARG   289      53.469  65.194  49.403  1.00 53.49      1DLC1992
ATOM   1869  N   LEU   290      49.142  59.187  53.599  1.00 21.77      1DLC1993
ATOM   1870  CA  LEU   290      48.535  58.454  54.710  1.00 19.60      1DLC1994
ATOM   1871  C   LEU   290      49.127  57.042  54.690  1.00 19.04      1DLC1995
ATOM   1872  O   LEU   290      49.307  56.414  55.728  1.00 16.89      1DLC1996
ATOM   1873  CB  LEU   290      47.014  58.368  54.545  1.00 17.72      1DLC1997
ATOM   1874  CG  LEU   290      46.117  59.392  55.235  1.00 17.03      1DLC1998
ATOM   1875  CD1 LEU   290      44.683  58.999  54.982  1.00 15.20      1DLC1999
ATOM   1876  CD2 LEU   290      46.382  59.429  56.728  1.00 13.88      1DLC2000
ATOM   1877  N   TYR   291      49.429  56.555  53.490  1.00 17.26      1DLC2001
ATOM   1878  CA  TYR   291      50.018  55.231  53.316  1.00 17.98      1DLC2002
ATOM   1879  C   TYR   291      51.379  55.394  52.648  1.00 19.81      1DLC2003
ATOM   1880  O   TYR   291      51.533  55.142  51.448  1.00 19.73      1DLC2004
ATOM   1881  CB  TYR   291      49.092  54.352  52.475  1.00 16.70      1DLC2005
ATOM   1882  CG  TYR   291      47.755  54.103  53.142  1.00 17.30      1DLC2006
ATOM   1883  CD1 TYR   291      47.582  53.028  54.024  1.00 15.12      1DLC2007
ATOM   1884  CD2 TYR   291      46.674  54.963  52.927  1.00 12.11      1DLC2008
ATOM   1885  CE1 TYR   291      46.379  52.820  54.672  1.00 15.98      1DLC2009
ATOM   1886  CE2 TYR   291      45.469  54.763  53.576  1.00 11.41      1DLC2010
ATOM   1887  CZ  TYR   291      45.330  53.689  54.444  1.00 13.44      1DLC2011
ATOM   1888  OH  TYR   291      44.140  53.472  55.077  1.00 16.97      1DLC2012
ATOM   1889  N   PRO   292      52.383  55.841  53.428  1.00 18.71      1DLC2013
ATOM   1890  CA  PRO   292      53.760  56.078  52.988  1.00 19.79      1DLC2014
ATOM   1891  C   PRO   292      54.569  54.810  52.769  1.00 21.30      1DLC2015
ATOM   1892  O   PRO   292      55.746  54.877  52.441  1.00 27.34      1DLC2016
ATOM   1893  CB  PRO   292      54.351  56.913  54.127  1.00 20.00      1DLC2017
ATOM   1894  CG  PRO   292      53.159  57.371  54.919  1.00 24.03      1DLC2018


ATOM 


1895 


CD 


PRO 


292 


52. 


248 


56. 


202 


54. 


843 
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00 


17. 


48 


1DLC2019 






ATOM   1896  N   LYS   293      53.959  53.657  53.010  1.00 20.08      1DLC2020
ATOM   1897  CA  LYS   293      54.625  52.381  52.793  1.00 20.58      1DLC2021
ATOM   1898  C   LYS   293      53.692  51.508  51.950  1.00 23.91      1DLC2022
ATOM   1899  O   LYS   293      52.543  51.834  51.690  1.00 27.52      1DLC2023
ATOM   1900  CB  LYS   293      54.881  51.655  54.114  1.00 17.96      1DLC2024
ATOM   1901  CG  LYS   293      55.286  52.512  55.267  1.00 20.88      1DLC2025
ATOM   1902  CD  LYS   293      55.560  51.644  56.484  1.00 25.87      1DLC2026
ATOM   1903  CE  LYS   293      55.546  52.459  57.770  1.00 34.09      1DLC2027
ATOM   1904  NZ  LYS   293      56.452  53.641  57.721  1.00 38.99      1DLC2028
ATOM   1905  N   GLU   294      54.180  50.345  51.530  1.00 22.97      1DLC2029
ATOM   1906  CA  GLU   294      53.359  49.411  50.772  1.00 20.92      1DLC2030
ATOM   1907  C   GLU   294      52.267  48.916  51.722  1.00 20.22      1DLC2031
ATOM   1908  O   GLU   294      52.471  48.835  52.938  1.00 19.74      1DLC2032
ATOM   1909  CB  GLU   294      54.192  48.218  50.314  1.00 24.35      1DLC2033
ATOM   1910  CG  GLU   294      55.435  48.586  49.521  1.00 36.21      1DLC2034
ATOM   1911  CD  GLU   294      56.236  47.367  49.086  1.00 37.88      1DLC2035
ATOM   1912  OE1 GLU   294      56.648  46.574  49.966  1.00 43.62      1DLC2036
ATOM   1913  OE2 GLU   294      56.450  47.207  47.864  1.00 40.00      1DLC2037
ATOM   1914  N   VAL   295      51.101  48.607  51.177  1.00 21.65      1DLC2038
ATOM   1915  CA  VAL   295      49.999  48.127  52.002  1.00 21.78      1DLC2039
ATOM   1916  C   VAL   295      49.628  46.669  51.752  1.00 20.58      1DLC2040
ATOM   1917  O   VAL   295      49.519  46.236  50.608  1.00 21.08      1DLC2041
ATOM   1918  CB  VAL   295      48.732  48.989  51.809  1.00 20.64      1DLC2042
ATOM   1919  CG1 VAL   295      47.553  48.390  52.569  1.00 22.03      1DLC2043
ATOM   1920  CG2 VAL   295      48.991  50.400  52.284  1.00 22.52      1DLC2044
ATOM   1921  N   LYS   296      49.509  45.905  52.834  1.00 18.88      1DLC2045
ATOM   1922  CA  LYS   296      49.082  44.516  52.754  1.00 17.81      1DLC2046
ATOM   1923  C   LYS   296      47.586  44.560  53.085  1.00 20.39      1DLC2047
ATOM   1924  O   LYS   296      47.195  45.013  54.172  1.00 20.26      1DLC2048
ATOM   1925  CB  LYS   296      49.809  43.651  53.782  1.00 15.08      1DLC2049
ATOM   1926  CG  LYS   296      49.313  42.216  53.816  1.00 12.45      1DLC2050
ATOM   1927  CD  LYS   296      49.980  41.428  54.928  1.00 16.66      1DLC2051
ATOM   1928  CE  LYS   296      49.434  40.008  55.028  1.00 12.92      1DLC2052
ATOM   1929  NZ  LYS   296      48.362  39.892  56.047  1.00 15.35      1DLC2053
ATOM   1930  N   THR   297      46.747  44.166  52.135  1.00 20.26      1DLC2054
ATOM   1931  CA  THR   297      45.309  44.186  52.373  1.00 19.90      1DLC2055
ATOM   1932  C   THR   297      44.572  43.049  51.671  1.00 19.99      1DLC2056
ATOM   1933  O   THR   297      45.177  42.286  50.923  1.00 24.16      1DLC2057
ATOM   1934  CB  THR   297      44.701  45.550  51.980  1.00 17.52      1DLC2058
ATOM   1935  OG1 THR   297      43.325  45.579  52.366  1.00 18.25      1DLC2059
ATOM   1936  CG2 THR   297      44.829  45.799  50.476  1.00 14.77      1DLC2060
ATOM   1937  N   GLU   298      43.270  42.935  51.913  1.00 20.33      1DLC2061
ATOM   1938  CA  GLU   298      42.461  41.874  51.307  1.00 19.29      1DLC2062
ATOM   1939  C   GLU   298      40.997  42.259  51.106  1.00 17.75      1DLC2063
ATOM   1940  O   GLU   298      40.470  43.129  51.795  1.00 20.54      1DLC2064
ATOM   1941  CB  GLU   298      42.536  40.605  52.166  1.00 14.48      1DLC2065
ATOM   1942  CG  GLU   298      42.130  40.827  53.611  1.00 14.45      1DLC2066
ATOM   1943  CD  GLU   298      42.441  39.644  54.506  1.00 15.68      1DLC2067
ATOM   1944  OE1 GLU   298      43.588  39.157  54.490  1.00 22.29      1DLC2068
ATOM   1945  OE2 GLU   298      41.544  39.211  55.247  1.00 19.26      1DLC2069
ATOM   1946  N   LEU   299      40.349  41.596  50.156  1.00 17.55      1DLC2070
ATOM   1947  CA  LEU   299      38.933  41.826  49.856  1.00 18.90      1DLC2071
ATOM   1948  C   LEU   299      38.165  40.652  50.488  1.00 21.39      1DLC2072
ATOM   1949  O   LEU   299      38.385  39.493  50.108  1.00 23.60      1DLC2073
ATOM   1950  CB  LEU   299      38.717  41.862  48.338  1.00 13.32      1DLC2074
ATOM   1951  CG  LEU   299      39.472  42.945  47.563  1.00 10.97      1DLC2075
ATOM   1952  CD1 LEU   299      39.359  42.688  46.085  1.00 14.82      1DLC2076
ATOM   1953  CD2 LEU   299      38.929  44.312  47.894  1.00  9.97      1DLC2077
ATOM   1954  N   THR   300      37.291  40.943  51.455  1.00 19.12      1DLC2078
ATOM   1955  CA  THR   300      36.539  39.902  52.163  1.00 17.44      1DLC2079
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ATOM 


1956 


C 


THR 


300 


35 


. 105 


33 


. 598 


51 


.710 


1 


. 00 


20 


.09 


1DLC2080 


ATOM   1957  O   THR   300      34.439  38.743  52.297  1.00 21.20      1DLC2081
ATOM   1958  CB  THR   300      36.492  40.200  53.680  1.00 18.14      1DLC2082
ATOM   1959  OG1 THR   300      35.829  41.449  53.904  1.00 20.85      1DLC2083
ATOM   1960  CG2 THR   300      37.899  40.264  54.270  1.00 17.00      1DLC2084
ATOM   1961  N   ARG   301      34.631  40.268  50.667  1.00 20.53      1DLC2085
ATOM   1962  CA  ARG   301      33.259  40.074  50.184  1.00 18.35      1DLC2086
ATOM   1963  C   ARG   301      32.963  38.762  49.451  1.00 19.18      1DLC2087
ATOM   1964  O   ARG   301      33.850  38.158  48.849  1.00 18.71      1DLC2088
ATOM   1965  CB  ARG   301      32.842  41.248  49.291  1.00 13.09      1DLC2089
ATOM   1966  CG  ARG   301      33.280  41.126  47.830  1.00 13.61      1DLC2090
ATOM   1967  CD  ARG   301      34.792  41.044  47.678  1.00 14.65      1DLC2091
ATOM   1968  NE  ARG   301      35.197  40.844  46.291  1.00 14.20      1DLC2092
ATOM   1969  CZ  ARG   301      35.305  39.661  45.690  1.00 17.27      1DLC2093
ATOM   1970  NH1 ARG   301      35.040  38.534  46.343  1.00 13.79      1DLC2094
ATOM   1971  NH2 ARG   301      35.667  39.613  44.420  1.00 14.12      1DLC2095
ATOM   1972  N   ASP   302      31.690  38.370  49.478  1.00 19.99      1DLC2096
ATOM   1973  CA  ASP   302      31.196  37.161  48.817  1.00 20.22      1DLC2097
ATOM   1974  C   ASP   302      30.662  37.483  47.432  1.00 20.75      1DLC2098
ATOM   1975  O   ASP   302      30.141  38.573  47.196  1.00 24.04      1DLC2099
ATOM   1976  CB  ASP   302      29.980  36.582  49.556  1.00 24.46      1DLC2100
ATOM   1977  CG  ASP   302      30.316  35.978  50.886  1.00 31.05      1DLC2101
ATOM   1978  OD1 ASP   302      31.510  35.740  51.162  1.00 41.38      1DLC2102
ATOM   1979  OD2 ASP   302      29.363  35.721  51.657  1.00 34.73      1DLC2103
ATOM   1980  N   VAL   303      30.741  36.501  46.544  1.00 18.31      1DLC2104
ATOM   1981  CA  VAL   303      30.182  36.601  45.199  1.00 18.50      1DLC2105
ATOM   1982  C   VAL   303      29.592  35.219  44.926  1.00 17.19      1DLC2106
ATOM   1983  O   VAL   303      30.131  34.204  45.370  1.00 18.03      1DLC2107
ATOM   1984  CB  VAL   303      31.204  37.008  44.094  1.00 19.37      1DLC2108
ATOM   1985  CG1 VAL   303      31.744  38.402  44.369  1.00 21.25      1DLC2109
ATOM   1986  CG2 VAL   303      32.328  35.992  43.973  1.00 21.50      1DLC2110
ATOM   1987  N   LEU   304      28.410  35.201  44.330  1.00 16.94      1DLC2111
ATOM   1988  CA  LEU   304      27.713  33.962  44.022  1.00 13.97      1DLC2112
ATOM   1989  C   LEU   304      27.764  33.726  42.527  1.00 14.14      1DLC2113
ATOM   1990  O   LEU   304      27.474  34.622  41.730  1.00 13.74      1DLC2114
ATOM   1991  CB  LEU   304      26.235  34.053  44.433  1.00 14.18      1DLC2115
ATOM   1992  CG  LEU   304      25.680  34.294  45.846  1.00 13.89      1DLC2116
ATOM   1993  CD1 LEU   304      25.252  32.993  46.457  1.00 14.90      1DLC2117
ATOM   1994  CD2 LEU   304      26.641  35.052  46.745  1.00 11.27      1DLC2118
ATOM   1995  N   THR   305      28.146  32.521  42.140  1.00 14.38      1DLC2119
ATOM   1996  CA  THR   305      28.175  32.187  40.721  1.00 16.18      1DLC2120
ATOM   1997  C   THR   305      26.737  31.864  40.282  1.00 17.39      1DLC2121
ATOM   1998  O   THR   305      25.845  31.688  41.124  1.00 15.94      1DLC2122
ATOM   1999  CB  THR   305      29.115  30.969  40.422  1.00 17.34      1DLC2123
ATOM   2000  OG1 THR   305      29.021  30.000  41.475  1.00 19.98      1DLC2124
ATOM   2001  CG2 THR   305      30.565  31.415  40.270  1.00 18.26      1DLC2125
ATOM   2002  N   ASP   306      26.488  31.891  38.978  1.00 15.04      1DLC2126
ATOM   2003  CA  ASP   306      25.177  31.551  38.437  1.00 17.79      1DLC2127
ATOM   2004  C   ASP   306      24.785  30.157  38.913  1.00 21.19      1DLC2128
ATOM   2005  O   ASP   306      25.651  29.278  39.067  1.00 24.55      1DLC2129
ATOM   2006  CB  ASP   306      25.222  31.530  36.912  1.00 15.07      1DLC2130
ATOM   2007  CG  ASP   306      25.286  32.905  36.310  1.00 15.88      1DLC2131
ATOM   2008  OD1 ASP   306      24.822  33.867  36.946  1.00 18.05      1DLC2132
ATOM   2009  OD2 ASP   306      25.777  33.024  35.174  1.00 23.36      1DLC2133
ATOM   2010  N   PRO   307      23.483  29.938  39.179  1.00 22.83      1DLC2134
ATOM   2011  CA  PRO   307      23.004  28.623  39.639  1.00 20.21      1DLC2135
ATOM   2012  C   PRO   307      23.289  27.551  38.587  1.00 16.74      1DLC2136
ATOM   2013  O   PRO   307      23.289  27.827  37.389  1.00 17.78      1DLC2137
ATOM   2014  CB  PRO   307      21.502  28.848  39.811  1.00 20.31      1DLC2138
ATOM   2015  CG  PRO   307      21.410  30.306  40.150  1.00 21.80      1DLC2139
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ATOM 


2016 


CD 


PRO 


307 


22 


. 388 


30 


. 925 


39 


. 175 


1 


. 00 


20 


.89 


1DLC2140 


ATOM   2017  N   ILE   308      23.547  26.333  39.033  1.00 18.04      1DLC2141
ATOM   2018  CA  ILE   308      23.851  25.245  38.107  1.00 19.10      1DLC2142
ATOM   2019  C   ILE   308      22.579  24.514  37.684  1.00 21.11      1DLC2143
ATOM   2020  O   ILE   308      22.004  23.759  38.473  1.00 23.40      1DLC2144
ATOM   2021  CB  ILE   308      24.833  24.227  38.753  1.00 20.63      1DLC2145
ATOM   2022  CG1 ILE   308      25.992  24.971  39.439  1.00 21.46      1DLC2146
ATOM   2023  CG2 ILE   308      25.352  23.252  37.700  1.00 19.02      1DLC2147
ATOM   2024  CD1 ILE   308      26.923  24.082  40.247  1.00 17.99      1DLC2148
ATOM   2025  N   VAL   309      22.119  24.763  36.460  1.00 23.15      1DLC2149
ATOM   2026  CA  VAL   309      20.907  24.108  35.944  1.00 25.59      1DLC2150
ATOM   2027  C   VAL   309      21.139  23.415  34.602  1.00 25.03      1DLC2151
ATOM   2028  O   VAL   309      21.908  23.891  33.778  1.00 24.20      1DLC2152
ATOM   2029  CB  VAL   309      19.689  25.093  35.808  1.00 23.31      1DLC2153
ATOM   2030  CG1 VAL   309      19.295  25.652  37.160  1.00 24.72      1DLC2154
ATOM   2031  CG2 VAL   309      20.005  26.214  34.847  1.00 25.99      1DLC2155
ATOM   2032  N   GLY   310      20.462  22.290  34.393  1.00 26.11      1DLC2156
ATOM   2033  CA  GLY   310      20.610  21.549  33.151  1.00 28.39      1DLC2157
ATOM   2034  C   GLY   310      19.889  22.100  31.926  1.00 31.24      1DLC2158
ATOM   2035  O   GLY   310      20.257  21.777  30.802  1.00 35.55      1DLC2159
ATOM   2036  N   VAL   311      18.866  22.927  32.128  1.00 32.38      1DLC2160
ATOM   2037  CA  VAL   311      18.102  23.498  31.012  1.00 30.87      1DLC2161
ATOM   2038  C   VAL   311      18.111  25.024  31.015  1.00 29.77      1DLC2162
ATOM   2039  O   VAL   311      18.175  25.648  32.070  1.00 31.01      1DLC2163
ATOM   2040  CB  VAL   311      16.616  23.022  31.028  1.00 31.20      1DLC2164
ATOM   2041  CG1 VAL   311      16.525  21.524  30.803  1.00 32.24      1DLC2165
ATOM   2042  CG2 VAL   311      15.966  23.374  32.347  1.00 28.27      1DLC2166
ATOM   2043  N   ASN   312      17.976  25.613  29.832  1.00 34.45      1DLC2167
ATOM   2044  CA  ASN   312      17.968  27.071  29.683  1.00 38.67      1DLC2168
ATOM   2045  C   ASN   312      16.678  27.741  30.145  1.00 37.69      1DLC2169
ATOM   2046  O   ASN   312      16.696  28.874  30.619  1.00 39.59      1DLC2170
ATOM   2047  CB  ASN   312      18.219  27.462  28.225  1.00 46.17      1DLC2171
ATOM   2048  CG  ASN   312      19.670  27.287  27.806  1.00 53.31      1DLC2172
ATOM   2049  OD1 ASN   312      20.581  27.275  28.638  1.00 56.28      1DLC2173
ATOM   2050  ND2 ASN   312      19.894  27.181  26.500  1.00 59.74      1DLC2174
ATOM   2051  N   ASN   313      15.559  27.048  29.963  1.00 35.04      1DLC2175
ATOM   2052  CA  ASN   313      14.243  27.559  30.339  1.00 31.97      1DLC2176
ATOM   2053  C   ASN   313      13.538  26.549  31.247  1.00 30.91      1DLC2177
ATOM   2054  O   ASN   313      13.221  25.432  30.819  1.00 32.94      1DLC2178
ATOM   2055  CB  ASN   313      13.427  27.807  29.065  1.00 31.98      1DLC2179
ATOM   2056  CG  ASN   313      12.051  28.364  29.339  1.00 29.11
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3 016 


CG 


GLN 


430 


49 . 


735 


24 . 


035 


54 . 


821 


1 . 


00 


77 . 


24 


1DLC3 140 


ATOM   3017  CD  GLN   430      48.372  24.495  54.295  1.00 83.08      1DLC3141
ATOM   3018  OE1 GLN   430      48.121  25.694  54.164  1.00 86.17      1DLC3142
ATOM   3019  NE2 GLN   430      47.483  23.546  54.020  1.00 82.81      1DLC3143
ATOM   3020  N   THR   431      51.643  21.973  51.409  1.00 57.70      1DLC3144
ATOM   3021  CA  THR   431      52.425  20.992  50.658  1.00 54.67      1DLC3145
ATOM   3022  C   THR   431      52.553  21.360  49.179  1.00 53.42      1DLC3146
ATOM   3023  O   THR   431      52.775  20.505  48.334  1.00 52.29      1DLC3147
ATOM   3024  CB  THR   431      51.786  19.594  50.760  1.00 54.43      1DLC3148
ATOM   3025  OG1 THR   431      50.488  19.617  50.155  1.00 55.57      1DLC3149
ATOM   3026  CG2 THR   431      51.627  19.188  52.223  1.00 54.20      1DLC3150
ATOM   3027  N   ASP   432      52.399  22.643  48.883  1.00 55.66      1DLC3151
ATOM   3028  CA  ASP   432      52.476  23.176  47.526  1.00 59.05      1DLC3152
ATOM   3029  C   ASP   432      52.136  22.273  46.343  1.00 59.61      1DLC3153
ATOM   3030  O   ASP   432      52.945  22.109  45.428  1.00 61.57      1DLC3154
ATOM   3031  CB  ASP   432      53.816  23.875  47.288  1.00 63.68      1DLC3155
ATOM   3032  CG  ASP   432      53.877  25.252  47.934  1.00 69.48      1DLC3156
ATOM   3033  OD1 ASP   432      53.439  26.238  47.293  1.00 70.82      1DLC3157
ATOM   3034  OD2 ASP   432      54.360  25.347  49.086  1.00 71.66      1DLC3158
ATOM   3035  N   GLU   433      50.928  21.711  46.346  1.00 59.44      1DLC3159
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ATOM 


3036 


CA 


GLU 


433 


50 


.480 


20 


. 371 


45 


. 233 


1 


.00 


57 


.29 


1DLC3160 


ATOM 


3037 


C 


GLU 


433 


48 


. 957 


20 


.787 


45 


. 047 


L 


.00 


52 


. 34 


1DLC3161 


ATOM 


3038 


0 


GLU 


433 


48 


. 178 


20 


.915 


45 


. 994 


1 


. 00 


46 


.51 


1DLC3162 


ATOM 


3039 


CB 


GLU 


433 


51 


. 137 


19 


.476 


45 


. 260 


1 


.00 


63 


.46 


1DLC3163 


ATOM 


3040 


CG 


GLU 


433 


50 


.290 


18 


.331 


45 


. 795 


1 


.00 


69 


.79 


1DLC3164 


ATOM 


3041 


CD 


GLU 


433 


50 


.272 


18 


.275 


47 


. 305 


1 


.00 


74 


.20 


1DLC3165 


ATOM 


3042 


OE1 


GLU 


433 


51 


.179 


17 


.634 


47 


.887 


1 


.00 


75 


.74 


1DLC3166 


ATOM 


3043 


OE2 


GLU 


433 


49 


.349 


18 


.868 


47 


. 907 


1 


.00 


78 


.38 


1DLC3167 


ATOM 


3044 


N 


ALA 


434 


48 


. 553 


20 


.634 


43 


. 792 


1 


.00 


51 


. 12 


1DLC3168 


ATOM 


3045 


CA 


ALA 


434 


47 


.146 


20 


.552 


43 


.419 


1 


.00 


50 


.02 


1DLC3169 


ATOM 


3046 


C 


ALA 


434 


46 


.636 


19 


.113 


43 


.301 


1 


. 00 


48 


. 51 


1DLC3170 


ATOM 


3047 


0 


ALA 


434 


47 


.418 


18 


.168 


43 


. 249 


1 


.00 


51 


.46 


1DLC3171 


ATOM 


3048 


CB 


ALA 


434 


46 


.913 


21 


.302 


42 


. 107 


1 


.00 


49 


. 76 


1DLC3172 


ATOM 


3049 


N 


SER 


435 


45 


.316 


18 


. 969 


43 


.250 


1 


.00 


44 


.96 


1DLC3173 


ATOM 


3050 


CA 


SER 


435 


44 


.650 


17 


.674 


43 


. 139 


1 


. 00 


38 


. 99 


1DLC3174 


ATOM 


3051 


C 


SER 


435 


43 


. 159 


17 


. 914 


42 


. 940 


1 


.00 


39 


.56 


1DLC3175 


ATOM 


3052 


0 


SER 


435 


42 


.693 


19 


.048 


42 


. 999 


1 


.00 


38 


. 70 


1DLC3176 


ATOM 


3053 


CB 


SER 


435 


44 


.845 


16 


.857 


44 


,415 


1 


.00 


35 


.36 


1DLC3177 


ATOM 


3054 


OG 


SER 


435 


44 


.241 


17 


.492 


45 


.528 


1 


.00 


32 


.98 


1DLC3178 


ATOM   3055  N    THR   436      42.407  16.847  42.705  1.00 38.68      1DLC3179
ATOM   3056  CA   THR   436      40.967  16.979  42.526  1.00 36.86      1DLC3180
ATOM   3057  C    THR   436      40.171  16.007  43.392  1.00 36.34      1DLC3181
ATOM   3058  O    THR   436      40.679  14.970  43.823  1.00 37.73      1DLC3182
ATOM   3059  CB   THR   436      40.540  16.757  41.065  1.00 35.68      1DLC3183
ATOM   3060  OG1  THR   436      40.788  15.397  40.699  1.00 41.63      1DLC3184
ATOM   3061  CG2  THR   436      41.313  17.674  40.130  1.00 37.09      1DLC3185
ATOM   3062  N    GLN   437      38.947  16.407  43.713  1.00 35.55      1DLC3186
ATOM   3063  CA   GLN   437      38.009  15.597  44.485  1.00 33.02      1DLC3187
ATOM   3064  C    GLN   437      36.709  15.729  43.723  1.00 34.09      1DLC3188
ATOM   3065  O    GLN   437      36.391  16.809  43.227  1.00 34.18      1DLC3189
ATOM   3066  CB   GLN   437      37.813  16.140  45.894  1.00 30.43      1DLC3190
ATOM   3067  CG   GLN   437      38.812  15.635  46.887  1.00 30.16      1DLC3191
ATOM   3068  CD   GLN   437      38.380  15.896  48.305  1.00 32.30      1DLC3192
ATOM   3069  OE1  GLN   437      37.962  14.984  49.015  1.00 36.80      1DLC3193
ATOM   3070  NE2  GLN   437      38.458  17.144  48.726  1.00 34.86      1DLC3194


ATOM   3071  N    THR   438      35.966  14.644  43.575  1.00 34.88      1DLC3195
ATOM   3072  CA   THR   438      34.725  14.759  42.828  1.00 35.70      1DLC3196
ATOM   3073  C    THR   438      33.533  14.042  43.417  1.00 34.07      1DLC3197
ATOM   3074  O    THR   438      33.663  13.090  44.177  1.00 35.71      1DLC3198
ATOM   3075  CB   THR   438      34.857  14.280  41.346  1.00 37.66      1DLC3199
ATOM   3076  OG1  THR   438      34.466  12.905  41.251  1.00 37.09      1DLC3200
ATOM   3077  CG2  THR   438      36.286  14.453  40.812  1.00 32.87      1DLC3201
ATOM   3078  N    TYR   439      32.366  14.560  43.074  1.00 34.88      1DLC3202
ATOM   3079  CA   TYR   439      31.100  13.988  43.473  1.00 32.18      1DLC3203
ATOM   3080  C    TYR   439      30.575  13.313  42.197  1.00 32.47      1DLC3204
ATOM   3081  O    TYR   439      30.636  13.888  41.102  1.00 30.88      1DLC3205
ATOM   3082  CB   TYR   439      30.141  15.089  43.936  1.00 29.14      1DLC3206
ATOM   3083  CG   TYR   439      28.691  14.683  43.824  1.00 30.66      1DLC3207
ATOM   3084  CD1  TYR   439      28.126  13.796  44.738  1.00 28.98      1DLC3208
ATOM   3085  CD2  TYR   439      27.902  15.117  42.751  1.00 27.98      1DLC3209
ATOM   3086  CE1  TYR   439      26.825  13.344  44.588  1.00 29.00      1DLC3210
ATOM   3087  CE2  TYR   439      26.598  14.667  42.595  1.00 26.79      1DLC3211
ATOM   3088  CZ   TYR   439      26.072  13.778  43.518  1.00 25.85      1DLC3212
ATOM   3089  OH   TYR   439      24.798  13.305  43.369  1.00 25.37      1DLC3213


ATOM   3090  N    ASP   440      30.098  12.082  42.323  1.00 33.95      1DLC3214
ATOM   3091  CA   ASP   440      29.568  11.366  41.162  1.00 36.24      1DLC3215
ATOM   3092  C    ASP   440      28.231  10.735  41.551  1.00 37.15      1DLC3216
ATOM   3093  O    ASP   440      28.152  10.014  42.547  1.00 37.73      1DLC3217
ATOM   3094  CB   ASP   440      30.560  10.285  40.711  1.00 37.85      1DLC3218
ATOM   3095  CG   ASP   440      30.332   9.831  39.277  1.00 40.56      1DLC3219
ATOM   3096  OD1  ASP   440      29.253   9.284  38.975  1.00 45.14      1DLC3220
ATOM   3097  OD2  ASP   440      31.246  10.006  38.442  1.00 44.83      1DLC3221
ATOM   3098  N    SER   441      27.177  11.035  40.794  1.00 36.81      1DLC3222
ATOM   3099  CA   SER   441      25.851  10.483  41.087  1.00 36.80      1DLC3223
ATOM   3100  C    SER   441      25.825   8.972  40.382  1.00 39.60      1DLC3224
ATOM   3101  O    SER   441      25.000   8.271  41.467  1.00 40.38      1DLC3225
ATOM   3102  CB   SER   441      24.781  11.133  40.212  1.00 33.16      1DLC3226
ATOM   3103  OG   SER   441      24.910  10.741  38.857  1.00 34.23      1DLC3227
ATOM   3104  N    LYS   442      26.738   8.495  40.039  1.00 41.35      1DLC3228
ATOM   3105  CA   LYS   442      26.901   7.078  39.701  1.00 45.48      1DLC3229
ATOM   3106  C    LYS   442      25.977   6.509  38.618  1.00 46.07      1DLC3230
ATOM   3107  O    LYS   442      26.260   5.446  38.055  1.00 49.36      1DLC3231
ATOM   3108  CB   LYS   442      26.905   6.192  40.953  1.00 49.45      1DLC3232
ATOM   3109  CG   LYS   442      28.279   5.606  41.309  1.00 56.04      1DLC3233
ATOM   3110  CD   LYS   442      28.626   4.379  40.446  1.00 58.91      1DLC3234
ATOM   3111  CE   LYS   442      29.111   4.762  39.050  1.00 60.19      1DLC3235
ATOM   3112  NZ   LYS   442      28.866   3.683  38.044  1.00 60.75      1DLC3236


ATOM   3113  N    ARG   443      24.885   7.201  38.310  1.00 43.51  1   1DLC3237
ATOM   3114  CA   ARG   443      24.001   6.738  37.247  1.00 40.12  1   1DLC3238
ATOM   3115  C    ARG   443      23.716   7.807  36.191  1.00 39.29  1   1DLC3239
ATOM   3116  O    ARG   443      23.231   8.897  36.498  1.00 39.09  1   1DLC3240
ATOM   3117  CB  AARG   443      22.686   6.168  37.792  0.50 39.58  1   1DLC3241
ATOM   3118  CB  BARG   443      22.701   6.194  37.849  0.50 39.12  1   1DLC3242
ATOM   3119  CG  AARG   443      21.828   5.580  36.678  0.50 38.20  1   1DLC3243
ATOM   3120  CG  BARG   443      21.551   6.015  36.868  0.50 37.50  1   1DLC3244
ATOM   3121  CD  AARG   443      20.703   4.678  37.156  0.50 38.35  1   1DLC3245
ATOM   3122  CD  BARG   443      20.501   5.057  37.413  0.50 36.41  1   1DLC3246
ATOM   3123  NE  AARG   443      20.266   3.824  36.050  0.50 38.24  1   1DLC3247
ATOM   3124  NE  BARG   443      20.317   5.184  38.857  0.50 34.60  1   1DLC3248
ATOM   3125  CZ  AARG   443      19.076   3.238  35.951  0.50 37.74  1   1DLC3249
ATOM   3126  CZ  BARG   443      19.823   4.226  39.636  0.50 34.26  1   1DLC3250
ATOM   3127  NH1 AARG   443      18.160   3.392  36.895  0.50 38.52  1   1DLC3251
ATOM   3128  NH1 BARG   443      19.452   3.064  39.110  0.50 31.83  1   1DLC3252
ATOM   3129  NH2 AARG   443      18.798   2.508  34.882  0.50 37.87  1   1DLC3253
ATOM   3130  NH2 BARG   443      19.727   4.418  40.948  0.50 31.14  1   1DLC3254


ATOM   3131  N    ASN   444      24.070   7.481  34.950  1.00 40.97      1DLC3255
ATOM   3132  CA   ASN   444      23.882   8.350  33.784  1.00 41.75      1DLC3256
ATOM   3133  C    ASN   444      24.271   9.814  34.045  1.00 42.79      1DLC3257
ATOM   3134  O    ASN   444      23.421  10.700  34.174  1.00 42.71      1DLC3258
ATOM   3135  CB   ASN   444      22.446   8.230  33.271  1.00 43.21      1DLC3259
ATOM   3136  CG   ASN   444      22.280   8.764  31.862  1.00 44.75      1DLC3260
ATOM   3137  OD1  ASN   444      23.258   9.086  31.182  1.00 47.47      1DLC3261
ATOM   3138  ND2  ASN   444      21.040   8.867  31.418  1.00 45.66      1DLC3262
ATOM   3139  N    VAL   445      25.579  10.041  34.079  1.00 42.96      1DLC3263
ATOM   3140  CA   VAL   445      26.195  11.345  34.342  1.00 41.13      1DLC3264
ATOM   3141  C    VAL   445      26.035  12.460  33.282  1.00 39.24      1DLC3265
ATOM   3142  O    VAL   445      26.037

All of the compositions and methods disclosed and claimed herein can be made
and executed without undue experimentation in light of the present disclosure. While the
compositions and methods of this invention have been described in terms of preferred
embodiments, it will be apparent to those of skill in the art that variations may be applied to the
compositions and methods and in the steps or in the sequence of steps of the method
described herein without departing from the concept, spirit and scope of the invention. More
specifically, it will be apparent that certain agents which are both chemically and
physiologically related may be substituted for the agents described herein while the same or
similar results would be achieved. All such similar substitutes and modifications apparent to those
skilled in the art are deemed to be within the spirit, scope and concept of the invention as
defined by the appended claims.